Foreword


Dr. Jeffrey Zheng was one of the first postgraduate students supervised by Prof. 
Qingshi Gao (Member, Chinese Academy of Sciences) at the Institute of Computing 
Technology, Chinese Academy of Sciences. I have known Dr. Zheng for 40 years 
since then. Building upon his postgraduate work (Parallel Sorting Algorithm and 0-1 
Transformation), Dr. Zheng has made significant contributions to the field of Variant
Construction, ranging from theoretical foundations to various applications. His
research has been published in many academic journals and conferences. For the
convenience of readers, Dr. Zheng compiled his representative works of 40 years
into two monographs with complementary contents. I believe that professionals in
related fields will find this book both an excellent reference and a source of
inspiration. Other readers will enjoy this book as an introduction to topics of Variant
Construction. I am very happy to recommend this book in the form of a foreword. 


Beijing, China Yunmei Dong 
April 2018 Professor, The Institute of Software 
Chinese Academy of Sciences 

Member, Chinese Academy of Sciences 


As head of the R&D team for Lenovo Chinese Systems, I am very pleased to see
the research work of my former colleague Dr. Jeffrey Zheng, which began 30 years ago
with the “Smoothly Enlarging Chinese Font Algorithm of 0-1 logic operations” at
the Institute of Computing Technology of the Chinese Academy of Sciences. His most
recent work, “Variant Construction”, is summarized as a professional monograph.
I expect this new measurement system to be used efficiently for advanced
cryptographic tests in modern cyberspace security. I am pleased to give this foreword.


Beijing, China Guangnan Ni 
April 2018 Professor, The Institute of Computing Technology 
Chinese Academy of Sciences 

Member, Chinese Academy of Engineering 
Dr. Jeffrey Zheng and I were in the first group of postgraduates majoring in Computer
Architecture at the Graduate School of the Chinese Academy of Sciences 40 years
ago. Professor Qingshi Gao (Member, Chinese Academy of Sciences) supervised
him in particular in the areas of parallel algorithms and computer architecture.

Dr. Jeffrey Zheng is one of the few classmates who continue to work in basic
research and advanced applications. It is great for Dr. Jeffrey Zheng to collect his 
research work in a monograph. Variant Measurement Technology could be used in 
the next generation of Quantum Cryptographic Communication Services. 

On the occasion of the 40th anniversary of the Graduate School of the Chinese
Academy of Sciences, I would like to express my good wishes as a classmate for
this monograph in the foreword. 


Beijing, China Guojie Li 
April 2018 Professor, The Institute of Computing Technology 
Chinese Academy of Sciences 

Member, Chinese Academy of Engineering 


Preface 


Associated with the fast development of science and technology in the twenty-first
century, modern computer and communication systems, built on the optical fiber
communication that supports the global Internet, have a profound influence on society
and the economy. As a result, globalization has become an extremely important issue in
social and economic systems. The Internet and optical fiber communication systems
have revolutionized the geographic and communication patterns of the world by
creating an open era of integrated global Internet connectivity. Quantum key
communication technology and quantum entanglement experiments on a quantum
satellite represent typical examples of China’s world-leading science and technology
from the perspective of frontier application research. The latest achievements of
artificial intelligence, led by AlphaGo, show the potential of intelligent
advanced technology based on deep learning, artificial neural networks,
and knowledge-based support vector machine systems. Related achievements are
very attractive, such as poetry robots, service robots, industrial robots, face
recognition, gesture recognition, unmanned aerial vehicles, self-driving cars, and
unmanned underwater vehicles. A list of military and civilian high-tech achievements
supports daily life with rich and colorful intelligent products.

From the viewpoint of mathematics and logic, the foundational framework used to
design and simulate both modern computer systems and optical fiber communication
networks depends on the 0-1 logical system and representations of
multiple bit states. For integrated circuits, the theoretical basis can be traced back to
the 1930s. Shannon applied Boolean algebra to circuit design, establishing
switch circuit theory; Turing proposed the Turing machine; and von Neumann
established a modern computer architecture. More than 50 years of development
have followed Moore’s Law: the observation that the number of transistors in a
dense integrated circuit doubles approximately every 2 years. Optimized very
large-scale integrated circuit technology now appears everywhere, with ever more
remarkable functions.
Looking ahead, the development of advanced science and technology is subject
to the limitations of basic theory and of the applications built on foundational supports. From
the perspective of basic research, how to extend this classical level is a very
interesting issue and an extremely difficult research topic.


Purpose of This Book 


After four decades of deep exploration on 0-1 logical systems, the authors extended
vector 0-1 logical systems to establish a variant logic framework in 2010. After
further research and development for one decade, three theoretical components
were established: variant logic, variant measurement, and variant map. At the same
time, various sample applications were investigated and developed. However,
because most published papers are scattered in professional journals, conference
proceedings, and academic books, it is difficult for other people to obtain
comprehensive information on the topic.

In addition, each article may be focused on a specific issue, and it is difficult for
readers to understand the whole structure from a few papers. This book organizes
the relevant papers and is the first book on variant construction to present
the selected papers with their intrinsic logical connections. The selected
papers are grouped into different parts. Based on this architecture, different readers
can easily access suitable content from specific chapters.


The Need for a New Logic System 


In modern computer and communication systems, the theory of switch circuits uses
multiple bits, states, and logic operations for state automata and combinatorial logic
units to design and implement complex computing and communication systems.
For solving linear equations with n variables, whether algebraic, Boolean,
or differential equations, it is useful to apply a matrix associated with a set
of eigenvectors. Matrices and eigenvalues provide valid solutions to periodic
problems on a special basis of periodic functions or under periodic boundary conditions.
However, it is difficult for periodic models to resolve exhaustive cases under
quasi-periodic, nonperiodic random, and chaotic conditions. For example,
modern cryptographic generation/analysis systems such as block ciphers
depend on a Substitution-Permutation Network (SPN). This type of network
connection on n-bit vectors of input/output transformation includes permutation
operations, where the total number of configuration functions is proportional to $2^n!$.
From a measuring viewpoint, cryptographic sequences need relevant
measurements, analysis models, and methods whose complexity is far beyond
that of methods based on state automata and combinational logic circuits.
Modern digital computing and communication technologies are based on classical
logic systems. The global Internet with its huge amounts of data models,
deep learning, artificial neural networks, and knowledge-based support vector
machines cannot cope with the internal states of exponentially growing models. Although
Fourier transform and wavelet transform are the most important tools for modern
spectrum analysis, there are significant limitations for this type of periodic scheme
in processing arbitrary random states and aperiodic types of complex functions in big
data environments. It is difficult for random applications to obtain convergent
results. Quantum mechanics and modern photonic-electronic applications have
confirmed the effectiveness of this frontier science.

Nobel Prize winner G. ’t Hooft proposed a cellular automaton interpretation of
quantum mechanics. The research results show that there is a common
overlap between classical logic and quantum mechanics at the Planck scale, in the
$10^{-33}$ range. It is necessary to use 0-1 vectors under permutation conditions to represent
quantum states. From a counting viewpoint, the complexity of such structures is
related to $2^n!$.

In classical statistics, the Ising model provides an analysis mechanism on 0-1
states. Based on the assumption of exhaustive states, an exact solution can be
compared with the mean field on one- and two-dimensional lattices. In general,
whether there is an exact solution under the condition of random permutation
distribution is an interesting topic worth further exploration. Modern experiments
have made good progress in advanced nanotechnology, fiber optics, laser photonics, and
ultrafast laser pulses in quantum optics technology. Advanced experiments in
nanotechnologies can be used to distinguish a series of quantum block/surface/line
and dot macro- to nanostructures, and the relevant emission and absorption spectra
can be observed. Both the wider continuous spectrum of thermal noise and the narrower
discrete spectrum of coherent laser beams are observed. In current research
problems, the measurement models and methods discussed are far different from the
quantum scale, and all results can be described in modern probability statistics.
However, for the complex operations associated with shift operations on the phase
space of permutations, modern statistical probability methods and tools have
difficulty handling symmetric groups directly under arbitrary random permutation
requirements.

Advanced Quantum Key Distribution (QKD), from a stochastic analysis
viewpoint, needs an effective measurement model and quantitative method to
identify the source of a random sequence. Is it generated from a quantum random
resource as a truly random sequence, or from a stream cipher as a pseudo-random
sequence? It is impossible to make this classification using the NIST random testing
package. It is also impossible to apply spectrum analysis and
linear equation tools to this type of target. More advanced models and methods are required.

For a 0-1 vector with multiple bits, analysis tools use classical probabilistic
statistical models and methods. Since the specific problem of randomness testing is
far beyond combinatorial analysis and state automata, it is difficult to meet the
demands of actual measurement and quantitative analysis due to the ultra-complexity
of substitution and permutation in complicated modes. Similar to modern
physics applying classical statistics, it is necessary to establish a solid logic
foundation that supports permutation and substitution operations in the logic mechanism,
so as to extend the analytical frontier and support both theoretical foundations and
practical applications.

Across mathematical logic, automatic control, quantum mechanics, artificial
intelligence, and other fields that use probability and statistics, random sequence
analysis and measurement based on n-variable 0-1 vectors and their linear
combinations cannot meet the measurement requirements of various applications. Modern
measuring methodology and technology need to use permutation and substitution
operations on different levels of logic foundation to satisfy the frontier
measurements in quantum physics, cryptography, and artificial intelligence. From a
measuring viewpoint, a new measuring system is urgently
required to deal with advanced applications.


Overview of Modern Group Theory 


From a discrete representative viewpoint, every abstract group is isomorphic to a 
subgroup of the symmetric group of some set (Cayley’s theorem) and permutations 
are the core basis in modern group theory. 

The beginning of modern group theory can be traced back to Galois’ contribution
in the 1830s. Klein studied transformation groups in the 1870s and proposed the
Erlangen program, which presents group theory as an invariant structure for symmetrical
patterns and transformations. Inspired by Klein, Lie used infinitesimal symmetry
transformations to establish a Lie algebra system.

Using the multiple tuples of variable structures, Hamilton proposed complex and
quaternion expressions. Influenced by Gordan’s work on invariant formulas, Hilbert used
a finite basis to construct a complete system of an algebraic structure on n variables.
In 1906, an infinite-dimensional Hilbert space of complex variables was developed.
Based on the series of automorphic functions, Poincaré was the first person to
discover a chaotic deterministic system, which laid the foundations of modern
complex dynamic systems, fractal, and chaos theory.

Noether’s investigations on Einstein’s general relativity determined the
conserved quantities for every physical law that possesses some continuous symmetry,
a result known as Noether’s theorem. A series of studies on invariants and symmetries
promoted the development of abstract algebra in the 1930s by refining algebraic
structures as groups, rings, algebras, fields, and lattices.

In the 1930s, Weyl established the group theory of quantum mechanics; the 
theoretical basis of quantum mechanics was established based on the symmetry 
operator. Since the 1940s, Hua developed a complex matrix representation under 
symplectic group using the unit circle as the core. In the 1950s, Yang proposed the
gauge invariance that plays a foundational role in modern field theory. Chern
established the fiber bundle structure for the differential geometry of the complex 
function. 
From the 1980s, gauge field theory became the basic mathematical tool of
modern physics. The eightfold/tenfold way of the quark model plays a key role in the
standard model of particle physics and the exploration of grand unified theory; the
corresponding group structures are SU(3)/SU(5).


Brief History on 0-1 Logic Systems 


From the perspective of the development of mathematical logic, the origin of the modern
0-1 logic system can be traced back to Leibniz’s invention of binary counting and
combinatorial analysis in the 1670s. In the 1850s, Boole proposed Boolean algebra;
in the 1900s, the logic school made logic the foundation of modern mathematics.

In the 1930s, Gödel proposed the incompleteness theorems, which show that some
statements are unprovable in a given formal system, in connection with Hilbert’s
decision problem. In 1936, Turing used an infinitely long 0-1 sequence with
read/write operations to define the Turing machine. Together with
Church’s Lambda calculus, the Church-Turing thesis lays the theoretical foundation
of computability and recursion theory.

Using 0-1 variables and logic operators, Shannon in 1937 proposed switch
theory, which provides the basis for module design, simulation, and implementation
in modern computers and communication systems. After more than half a
century of revolutionary development of semiconductor chips, with electronic circuits
moving from discrete components to integrated circuits and then to very large-scale
integrated circuits, switch theory provides a solid foundation for the basic theory,
application analysis, and design tools.

Although the modern logic system was originally developed from Leibniz, the use of
permutation modes in state transformations can be traced back several thousand
years to ancient times in oriental history. In the I-Ching system developed
from the early days, Yin and Yang’s representations are identified as the roots. Five
thousand years ago, Fu-hsi proposed eight trigrams as an initial set that can be
represented as eight states of three 0-1 variables. Using modern mathematics, one
can see that the representations of the three layers of trigrams of Yin/Yang are
equivalent to the eight diagrams and eight states of three 0-1 variables. Three
thousand years ago, King Wen of the Zhou dynasty proposed another order of the eight
trigrams, different from that of Fu-hsi, that is, a permutation of the Fu-hsi group. In
the 1050s, Shao Yung proposed a balanced binary tree as a natural order of a binary
system, the same as the Leibniz binary counting.
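
The correspondence between trigrams and three 0-1 variables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding 0 = Yin, 1 = Yang, the helper names, and the sample permutation are hypothetical, and the permutation shown is not claimed to be the King Wen order.

```python
from itertools import product

def trigram_states():
    """Return all 8 states of three 0-1 variables in binary counting order."""
    return list(product((0, 1), repeat=3))

def apply_order(states, order):
    """Rearrange the states by a permutation given as a list of indices."""
    if sorted(order) != list(range(len(states))):
        raise ValueError("order must be a permutation of the state indices")
    return [states[i] for i in order]

states = trigram_states()

# An arbitrary illustrative permutation of the 8 states (assumed, not historical).
example_order = [7, 3, 5, 1, 6, 2, 4, 0]
reordered = apply_order(states, example_order)

for original, moved in zip(states, reordered):
    print(original, "->", moved)
```

Any other arrangement of the same eight states is obtained by changing `example_order`, which is the sense in which one ordering of the trigrams is a permutation of another.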

Ancient Oriental philosophers have developed the logical foundation of Chinese 
traditional culture using this Yin/Yang symbol system. However, it must be pointed 
out that subsets of states are contained in this system with various logic paradoxes 
at different levels. This dialectical logic system based on the I-Ching has difficulty
meeting a list of important characteristics in formal logic: consistency, completeness,
noncontradiction, soundness, etc. 
Modern 0-1 Vector Algebra


When 0-1 vectors and logic operators are used in vector operation mode, it is natural
to extend parallel bit operations from a single bit to multiple bits. In addition,
for bit operations to be performed effectively on multiple bits, it is
necessary to implement permutation operations among bits. It is convenient to
define a pair of bits with a fixed distance and cyclic shift operations on a given
vector.
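
As a minimal sketch of these ideas, the following Python fragment shows a cyclic shift on a 0-1 vector, the pair of bits at a fixed cyclic distance, and one bitwise operation applied to all such pairs at once. The function names and the choice of XOR as the example operator are assumptions made for illustration; they are not taken from the source.

```python
def cyclic_shift(bits, k):
    """Rotate a 0-1 vector left by k positions (k may be negative or exceed the length)."""
    n = len(bits)
    if n == 0:
        return []
    k %= n
    return bits[k:] + bits[:k]

def pair_at_distance(bits, i, d):
    """Return the pair (bits[i], bits[(i + d) mod n]) for a fixed cyclic distance d."""
    n = len(bits)
    return bits[i], bits[(i + d) % n]

def xor_with_shift(bits, d):
    """XOR a vector with its cyclic shift by d, acting on every pair at distance d in one step."""
    shifted = cyclic_shift(bits, d)
    return [a ^ b for a, b in zip(bits, shifted)]

vector = [1, 0, 0, 1, 1]
print(cyclic_shift(vector, 2))
print(pair_at_distance(vector, 3, 2))
print(xor_with_shift(vector, 1))
```

Because the shift wraps around, every position has a partner at distance `d`, so a single-bit operator extends uniformly to the whole vector.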

In the 1970s, Lee described cyclic shift operations in Modern Switch Circuit 
Theory and Digital Design. From the formula of vector switching functions, the 
canonical forms of vector switching functions are extremely complex and very 
powerful transformations. 

Associated with the advanced development of block ciphers in cryptography, a
new vector extension has been developed as Advanced Vector Extensions (AVX).
A specific development of new instructions for the AES cipher algorithm is the AES-NI
package, which shows the latest achievements for block ciphers.

Under this type of vector permutation-substitution components, complex cryptographic
algorithms can efficiently perform encryption and decryption
under permutation and substitution commands.


Introduction to Variant Construction 


In the 1980s, the author studied the sorting problem on a vector of N integer
elements using the symmetric group under 0-1 vector control, and constructed
high-performance parallel sorting algorithms. Then, smoothly enlarging algorithms 
for Chinese fonts were proposed using logic operations on 2D bitmaps. In the 
1990s, multiple levels of invariants were used to organize a state set as a phase 
space, and the conjugate classification and transformation of binary images was 
established. 

In 2010, a new vector logic system, Variant Logic, was proposed using two composite
operations: permutation and complement. After 8 years of in-depth exploration,
the variant construction is composed
of three core components: variant logic, variant measurement, and variant map.

Using four meta states, multiple probability and statistical measurements can be 
constructed. By associating these measurements with quantitative expressions and 
combinatorial projections, more than 60 research papers and book chapters were 
published. Relevant contents are covered from theoretical foundation to sample 
applications. Since all these papers are published in various places all over the 
world, it is difficult for readers to systematically collect them for further reading. 
This book is the first one to collect the most relevant papers from theoretical 
foundation to sample applications to organize the variant construction as variant 
logic, variant measurement, variant map, meta model, and sample application
systematically. 


The Organization of This Book 


This book is composed of nine subparts in two main parts: theoretical foundation 
and sample application. The theoretical foundation is composed of four subparts: 
Variant Logic, Variant Measurement, Variant Map, and Meta Model. 

Variant Logic describes n-variable 0-1 vectors with $2^n$ states, which form a
variant configuration space with $2^n! \times 2^{2^n}$ members.
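
To give a feel for how quickly this count grows, the short Python sketch below evaluates it for small n. It assumes the expression is read as the factorial of the number of states multiplied by the number of 0-1 functions on those states; the function name is hypothetical.

```python
from math import factorial

def configuration_space_size(n):
    """Return (2^n)! * 2^(2^n) for n 0-1 variables, under the reading assumed above."""
    states = 2 ** n             # number of states of n 0-1 variables
    functions = 2 ** states     # number of 0-1 functions on those states
    return factorial(states) * functions

for n in range(1, 4):
    print(n, configuration_space_size(n))
```

For two variables this gives 4! x 16 = 384, and for three variables 8! x 256 = 10,321,920, so exhaustive enumeration is practical only for very small n.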

Variant Measurement defines, on n-tuple 0-1 vectors, four meta measures and ten
expansion operators.

Variant Map illustrates $2^n$ states and their transforming states, and multiple
statistical probability distributions are investigated using four meta measures and their
combinations in higher dimensional distributions.

Meta Model describes a concept cell model of knowledge representation and a 
multiple probability model on voting. 

The sample application part is composed of five subparts: Global Visualization,
Quantum Interaction, Random Sequence, DNA Sequence, and Multi-valued Pulse
Sequence. In Global Visualization, a list of function maps is used for medical image
analysis and for the cellular automata rule space under exhaustive arrangement. In Quantum
Interaction, conditional and relative probability distributions simulate two paths of
quantum interactive effects. Random Sequence provides variant random number
generators and a unified measurement model to handle both pseudo and truly random
sequences in modern cryptographic applications on variant maps. In DNA
Sequence, whole gene sequences are mapped on variant maps. In Multi-valued
Pulse Sequence, bat echo/ECG sequences are mapped on variant maps.


Suitable Readers of This Book 


This book includes a wide range of topics from theoretical foundation to sample
applications. Different parts may be suitable for specific groups. Variant Logic,
Meta Model, and Variant Measurement are useful for researchers and graduate
students doing basic research on logic, probability, statistics, analysis, and measures
in mathematical foundations, combinatorial mathematics, metamathematics, quantum
logic, and combinatorial group theory; Variant Measurement and
Variant Map are suitable for application researchers and engineers in big data,
complicated system analysis, feature extraction, artificial intelligence, and applied
mathematics, as well as software engineers, senior college students, and postgraduate
students; Variant Map and the sample applications are suitable for those working on
complex system analysis/design, including data engineers, big data engineers,
artificial intelligence engineers, and application development engineers, as well as
postgraduate and senior undergraduate students.


Kunming, Yunnan, China Jeffrey Zheng 
April 2018 
Part I
Theoretical Foundation—Variant Logic 


I-Ching has three key properties: 1. Simple, 2. Variant, 3. Invariant. 
—Zheng Xuan 


The Monad, of which we shall here speak, is nothing but a simple 
substance, which enters into compounds. By simple is meant without 
parts. 


—Gottfried W. Leibniz 


Quaternions came from Hamilton after his really good work had been 
done, and though beautifully ingenious, have been an unmixed evil to 
those who have touched them in any way. 


—Lord Kelvin 


From a historical viewpoint, the first paper on the variant logic foundation (A framework
to express variant and invariant functional spaces for binary logic) was
published in Frontiers of Electrical and Electronic Engineering in China, Higher
Education Press and Springer 5(2):163-167 (2010). An extensive book chapter
(Chapter “A framework of variant-logic construction for cellular automata”) was
published in the OA book Cellular Automata - Innovative Modelling for Science
and Engineering: 325-352 (2011) by InTech Press to describe a variant logic
framework systematically.

Part I is composed of two chapters (1-2).

Chapter “Variant Logic Construction Under Permutation and Complementary
Operations on Binary Logic” shows the core construction of variant logic under
two vector operations (Permutation, Complement) on 0-1 logic.

Chapter “Hierarchical Organization of Variant Logic” describes the complex
hierarchical organization under variant logic construction and compares it with other logic
systems.


Variant Logic Construction Under
Permutation and Complementary
Operations on Binary Logic


Jeffrey Zheng 


Abstract This chapter presents a binary logic framework whose function elements
are invariant under permutation and complementary operations. The entire framework
is described using 4 levels of hierarchy: n variables, $2^n$ states, $2^{2^n}$ functions,
and $2^n! \times 2^{2^n}$ logic functionals. Under the proposed framework, it is possible to
determine higher level function complexity by analysing lower levels of organisation
characteristics. These characteristics can be determined quite accurately because the
symmetry conditions of variable and state organisations have invariant logic functions
and a corresponding logic functional organisation. More symmetrical arrangement at
state level creates more symmetrical permutations within the function space. Lower
level properties are highly influential on the higher level properties of function
components within a logic functional space. The proposed framework provides a logic
foundation to describe complex binary systems using lower level properties, making
analysis of systems more efficient and less calculation intensive. Different global
coding schemes are discussed and typical two-variable cases of logic functionals are
illustrated.


Keywords Vector permutation · Complement · Variant logic · Functional space
1 Introduction


Mathematical invariance [1, 2] is key in the understanding and development of new 
scientific theories and technologies [3]. Most scientific theories rely on invariant 
properties of group behaviour and transformations [4] to describe the rules of the 
world we live in. Theories such as relativity and quantum mechanics all rely on 
invariance properties for their constructs [5]. In the field of mathematical logic, 
construction of theoretical frameworks [6, 7] focuses upon three hierarchical levels:
variables, states and function spaces. Boolean algebra and switching theory [8, 9] 
exploit combinatorial invariant properties, and use these foundational properties for 
implementing new theories and applications. 

For reasons of consistency and symmetry of structure, logical operations are
restricted to two types of canonical forms, namely the product-of-sums and the
sum-of-products approach. Any complex logic function can be rewritten as these two
canonical forms. The use of a truth table enables analysis and the transformation into 
the canonical representations [6]. 
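
The route from a truth table to the two canonical forms can be sketched as follows. This is an illustrative Python fragment, not a method taken from the source: the function names are hypothetical, and the symbols `~`, `&`, and `|` are simply a textual notation chosen here for NOT, AND, and OR.

```python
from itertools import product

def truth_table(f, n):
    """List (inputs, output) rows for an n-variable 0-1 function f."""
    return [(bits, f(*bits)) for bits in product((0, 1), repeat=n)]

def sum_of_products(table, names):
    """Canonical sum-of-products: OR of one minterm per row whose output is 1."""
    terms = []
    for bits, out in table:
        if out == 1:
            literals = [v if b else "~" + v for v, b in zip(names, bits)]
            terms.append("(" + " & ".join(literals) + ")")
    return " | ".join(terms) if terms else "0"

def product_of_sums(table, names):
    """Canonical product-of-sums: AND of one maxterm per row whose output is 0."""
    terms = []
    for bits, out in table:
        if out == 0:
            literals = ["~" + v if b else v for v, b in zip(names, bits)]
            terms.append("(" + " | ".join(literals) + ")")
    return " & ".join(terms) if terms else "1"

xor_table = truth_table(lambda a, b: a ^ b, 2)
print(sum_of_products(xor_table, ["a", "b"]))
print(product_of_sums(xor_table, ["a", "b"]))
```

Both expressions describe the same function; the first is built from the rows where the output is 1 and the second from the rows where it is 0.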

Following the introduction of Conway’s Game of Life [10], Stephen Wolfram
started in the 1980s [11, 12] to apply Boolean algebra to describe the behaviour
of Cellular Automata. His approach used a binary counting sequence to name
different rules of behaviour based upon the functions generating the next iteration
in the game. Wolfram identified four classes of transformations within the rules of
Cellular Automata (CA). His findings are published in his book [13], “A New
Kind of Science”. The main method of analysis in this area of research chooses a CA
operation and recursively applies the operation to different initial conditions to find
emergent patterns from the process. This approach creates many interesting results
that can be visually identified [14, 15].
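
The naming of rules by a binary counting sequence, and the recursive application of a rule to an initial condition, can be sketched for the one-dimensional, three-cell-neighbourhood case. The restriction to this elementary case, the periodic boundary, and all function names are assumptions made for this illustration.

```python
def rule_table(rule_number):
    """Map each neighbourhood (left, centre, right) to the next centre value.

    Bit k of rule_number is the output for the neighbourhood whose three
    bits spell k in binary.
    """
    if not 0 <= rule_number <= 255:
        raise ValueError("rule_number must be between 0 and 255")
    return {
        ((k >> 2) & 1, (k >> 1) & 1, k & 1): (rule_number >> k) & 1
        for k in range(8)
    }

def step(cells, table):
    """Apply one synchronous update with periodic (wrap-around) boundaries."""
    n = len(cells)
    return [table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])] for i in range(n)]

def run(cells, rule_number, steps):
    """Return the list of successive configurations, starting with the initial one."""
    table = rule_table(rule_number)
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], table))
    return history

for row in run([0, 0, 0, 1, 0, 0, 0], 90, 3):
    print("".join("#" if c else "." for c in row))
```

Changing the rule number or the initial row and printing the history is the kind of visual exploration the paragraph above describes.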

In the analysis of dynamic systems, it is essential to identify transformation spaces 
with functional invariance [16, 17]. An example in physics is phase space [2]. The 
phase space plays an essential role to describe key properties of a given dynamic 
system. Phase characteristics are more difficult to construct under a logic framework. 
A mechanism for linking lower level characteristics with higher level properties
such as symmetry currently does not exist. Under combinatorial logic, different 
permutations add no additional information to access information in phase space 
[14]. 


1.1 Western and Eastern Logic Traditions 


Beginning with Aristotle (384-322 B.C.), the foundations of Western logic have
played a key role in the development of today’s global society [18]. The modern
theory of logic systems comprises a series of contributions by outstanding
individuals: G. Leibniz and the introduction of the Binary
Number System (1646-1716) [19, 20]; G. Boole and the development of Boolean
Logic (1854) [21]; G. Cantor and Set Theory (1879); G. Frege and Conceptual
Logic (1879) [22, 23]; B. Russell and Russell’s Paradox (1910) [24]; J. Lukasiewicz
and Multiple-Valued Logic (1920); D. Hilbert and Foundations of Geometric Logic
(1923) [25]; K. Gödel and his Incompleteness Theorem (1931) [22]; A. Turing and the
Turing Machine (1936) [26]; C. Shannon and Switching Theory (1937) [27]; H.
Reichenbach and Probability Logic (1949) [28]; as well as L. Zadeh and Fuzzy
Logic (1965) [29]. Development of such theorems and mathematical frameworks
has enabled Western culture to understand the operation of our world as a set of
implementable rules. Logic and the development of rules for the expression of logic
have provided a language that enabled the construction of today’s scientific societies.

In contrast to the binary on-off nature of Western logic, Oriental culture has been
influenced by spiritual traditions of balance and harmony. The theme of balance can
be summarised in the I-Ching or ‘The Book of Changes’, one of the most influential
books of classic Oriental literature [30-37]. The concept of Yin and Yang forces
and the subtle interplay of the two opposing forces yield combinations and
permutations of change. Oriental philosophy held that ‘the only constant phenomenon is
change’, and such a worldview emphasised the dynamic nature of a system; rather
than focusing on the individual states of a system (on, off), prominence was instead
placed on operations that yield change (on to off, off to on). The structure of thought
introduced by the I-Ching allowed change to be systematically documented and
analysed. Complex interactions, cyclic behaviour and the interplay of nature at all levels
of oriental culture (sociology, literature, medicine, astrology and religion) could
be described using the tools of dynamic logic provided by the I-Ching; the
framework remains a complete philosophy as well as a universal language and has
remained unchanged over the past two thousand years [38].

As early as 1690, Leibniz realised that the balanced yin-yang structure proposed
by Shao Yong (1050) was equivalent to the binary number system [33, 38]. However,
the Western scientific community has mostly disregarded the I-Ching, due mainly
to cultural and language barriers as well as local superstitions that cloud the essence
of the framework. In its ancient form of allegories and metaphors, the I-Ching is
unable to satisfy the logician’s requirement for completeness, consistency and other
such properties. The challenge then is to present this philosophy for modern
times, in the language of mathematics. Stripped of its colourful language, what
insights does this ancient system contain? What are the essential differences between
modern binary logic and the I-Ching’s dynamic binary structures? The unification
of these two schools of thought would bring greater understanding of the world we
live in [35]. As the modern formulation of Cellular Automata generates complexity
through binary logic whilst the I-Ching analyses complexity through binary logic, the
modern language of the I-Ching can be found in the creation of a structural definition
of CA.
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1.2 Logic and Dynamic Systems 


In the field of mathematical logic, the construction of theoretical frameworks focuses upon
three spatial hierarchies: variables, states and function spaces [6, 7]. Boolean algebra
and switching theory exploit such properties, using the combinatorial invariance
of the framework for implementing new theories and applications [8, 9]. Logical
operations are restricted to two types of canonical forms, namely the product-of-sums
and the sum-of-products approaches. Any complex logic function can be rewritten
in these two canonical forms. This is done for reasons of consistency, simplicity
and symmetry of structure; the use of a truth table enables analysis and the
transformation into the canonical representations [6].

In the analysis of dynamic systems, it is essential to identify transformation spaces
with functional invariance [16, 17]. The Ising model is arguably the simplest binary
system that undergoes a nontrivial phase transition [14]. In modern physics, this
type of model uses a structure linked to the phase space representation of a dynamic
system [2]. The phase space plays an essential role in describing key properties of any
dynamic system; however, under classical logic, phase characteristics are difficult to
construct. A mechanism for linking low-level representations such as variables and
states with higher level group properties such as symmetric conditions currently does
not exist. This is a limitation of the language and of the operations allowed by the
language. Classical logic is based on static combinatorial structures. Permutations,
which are intrinsic to phase space, cannot be expressed under such a framework
of classical combinatorial logic [14]. Cellular Automata frameworks [39], however,
are fully dynamic and have been used to describe phase space [2]. Inspired by the
traditional I-Ching hierarchical structures, new conditions, operations and
relationships have been proposed on top of the Classical Logic framework to incorporate
the dynamic nature of CA. The additional constructs provide support for CA using
a framework that is logically consistent and complete [40].

The [40] proposal builds upon earlier studies of logic systems from a structural
viewpoint. Kunii and Takai [41] applied an n-cell structure for analysis, classification
and generation of visual objects using topology and homotopy tools in computer
graphics [42-46]. In the 1990s, Zheng and Maeder [47] proposed a balanced
classification for conjugate classification and transformation of binary images on
regular plane lattices to visualise different configurations [15, 48-50]. All
such work used partial constructs of the [40] framework. The proposed framework
supports classical logic, vector permutation and complementary operations. The new
construction requires five spatial hierarchies containing $2^{2^n} \times 2^n!$ functional
configurations for any n variables. This structure is much larger than classical logic, which has
three spatial hierarchies supporting $2^{2^n}$ functions for n variables. Newly defined
symmetric properties play an important role in predictions and classifications of possible
recursive results. Using such properties, global behaviour can be identified and
classified. A disadvantage of the new framework lies in its extreme complexity. It is
possible to use parallel computers to analyse the configurations contained in the
n = 3 space (which already includes more than $10^7$ configurations). It is impossible
using today’s technology to process the n = 5 space due to the extreme growth of
structural complexity ($2^{32} \times 32!$ configurations).

This chapter describes a logic framework, using invariant characteristics of per- 
mutations and complementary operations to identify an invariant structure under such 
mixed operations. This allows the definition of a phase space to be introduced into 
logic. The transformation does not change the relevant function space. A proposed 
2D representation provides additional properties to predict different behaviours from 
permutations that influence higher level structures in a logic functional space. 


2 Truth Table Representation for a Logic Function Space 


The proposed framework describes three levels of a logic function space and the 
truth table representation of the space. 


2.1 Basic Definitions 


$$f: X \to Y; \quad Y = f(X); \quad X, Y \in B_2^N$$
$$X = X_{N-1}X_{N-2} \ldots X_j \ldots X_1X_0, \quad Y = Y_{N-1}Y_{N-2} \ldots Y_j \ldots Y_1Y_0 \tag{1}$$
$$X_j, Y_j \in B_2, \quad 0 \le j < N$$


An example of a transform: the sequence X = 0001110100, N = 10, is an input for a
function operation f; the output is a sequence of the same length, Y = 1101011001;
$X, Y \in B_2^{10}$.


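Definition 1 Let $[\ldots X_j \ldots]$ be an n-bit structure:

$$[\ldots X_j \ldots] = x_{n-1}x_{n-2} \ldots x_i \ldots x_1x_0 = x \tag{2}$$
$$0 \le i < n, \quad 0 \le j < N, \quad x \in B_2^n$$
where $X_j = x_i$ is a corresponding position.
$$Y_j = f([\ldots X_j \ldots]) = f(x_{n-1}x_{n-2} \ldots x_i \ldots x_1x_0) = f(x) \tag{3}$$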


In Boolean logic, n variables correspond to a full truth table with $2^n \times 2^{2^n}$ entries.
The $I$th meta-state, $0 \le I < 2^n$, is an n-bit number occupying the $I$th column position;
the $J$th function $T(J)$, $0 \le J < 2^{2^n}$, occupies the $J$th row with $2^n$ bits; the function value
of the $I$th entry is given by $T(J)_I$. The full table can be represented as follows
(Table 1):
Table 1 Truth Tables of n-variables

$0 \le I < 2^n$                      | $S_{2^n-1}$   ...  $S_I$                          ...  $S_1$      $S_0$
$[x_{n-1} \ldots x_i \ldots x_0]$    | 1...1...1     ...  $I_{n-1} \ldots I_i \ldots I_0$  ...  0...0...1  0...0...0
$0 \le J < 2^{2^n}$                  | $J_{2^n-1}$   ...  $J_I$                          ...  $J_1$      $J_0$
$T(0)$                               | 0             ...  0                              ...  0          0
$T(1)$                               | 0             ...  0                              ...  0          1
$T(2)$                               | 0             ...  0                              ...  1          0
$T(J)$                               | $J_{2^n-1}$   ...  $J_I$                          ...  $J_1$      $J_0$
$T(2^{2^n}-2)$                       | 1             ...  1                              ...  1          0
$T(2^{2^n}-1)$                       | 1             ...  1                              ...  1          1

Method 1: Process Method of Truth Table


Input: x: n variables in a {0, 1} sequence; J: selected function number
Process: Using the input sequence x, the meta-state number I selects
the I-th column of function T(J)
Output: Return the value of $T(J)_I$ (1 for true and 0 for false) as output.
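
Method 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `meta_state` and `truth_value` are hypothetical, x is taken most significant bit first, and bit I of the integer J is assumed to hold the entry $T(J)_I$.

```python
def meta_state(x):
    """Read the 0/1 sequence x (most significant bit first) as the meta-state number I."""
    I = 0
    for bit in x:
        I = (I << 1) | bit
    return I


def truth_value(x, J):
    """Return T(J)_I, i.e. bit I of the function number J, where I is the meta-state of x."""
    return (J >> meta_state(x)) & 1


# Row of function J = 178 for n = 3, listed for columns 7 down to 0.
row = [truth_value([(I >> 2) & 1, (I >> 1) & 1, I & 1], 178) for I in range(7, -1, -1)]
print(row)
```

Storing a function as the integer J keeps the lookup to a shift and a mask, which mirrors the column selection described in the Process step.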


2.2 Permutation Invariants 


Proposition 1 (Sequential Mapping) Under sequential order, T(J) = J.


Proof The relevant output entries of T(J) are mapped to the binary number J having
$2^n$ bits:
$$T(J) = T(S_{2^n-1}(J_{2^n-1})) \ldots T(S_I(J_I)) \ldots T(S_0(J_0))$$
$$= T(J)_{2^n-1} \ldots T(J)_I \ldots T(J)_0 = J \in B_2^{2^n} \tag{4}$$
$$T(J)_I = T(S_I(J_I)) = J_I \in B_2, \quad 0 \le I < 2^n, \quad 0 \le J < 2^{2^n}$$


Definition 2 For any n binary logic variables, let $\Omega(N)$ be the symmetric group on
N elements and P be a permutation operator, $P \in \Omega(2^n)$. Then for any J there exists K,
$J, K \in B_2^{2^n}$, with $P(T(J)) = K$, $0 \le J, K < 2^{2^n}$, and the permutation can be represented
in Truth Table form:

$$P: J \to K$$
$$P(T(J)) = P(T(S_{2^n-1}(J_{2^n-1}))) \ldots P(T(S_I(J_I))) \ldots P(T(S_0(J_0)))$$
$$= P(T(J)_{2^n-1}) \ldots P(T(J)_I) \ldots P(T(J)_0)$$
$$= K_{2^n-1} \ldots K_I \ldots K_0 = K \in B_2^{2^n} \tag{5}$$
$$P(T(J)_I) = P(T(S_I(J_I))) = T(S_{P(I)}(J_{P(I)}))$$
$$= T(J)_{P(I)} = J_{P(I)} = K_I \in B_2$$
$$0 \le I < 2^n, \quad 0 \le J, K < 2^{2^n}, \quad P \in \Omega(2^n)$$
Proposition 2 The Truth Table under permutation operations on $2^n$ meta-states can
generate $2^n!$ sequences of $2^n$ integers.


Proof For any $P \in \Omega(2^n)$, the $2^n$ positions are independent, so the permutations
are exactly the $2^n!$ elements of $\Omega(2^n)$. □


For the one-variable condition (i.e. n = 1), there are only two possible
arrangements. The initial sequence is represented as $S = S_1S_0 = 10$, and a permutation
operation generates the output $P(S) = S_0S_1 = 01$. The following shows two groups
of results:


Meta-state | S | 1 0 | P(S) | 0 1
Function   | J |     | P(J) |
0          | 0 | 0 0 | 0    | 0 0
¬x         | 1 | 0 1 | 2    | 1 0
x          | 2 | 1 0 | 1    | 0 1
1          | 3 | 1 1 | 3    | 1 1


For any permutation operation, the function T(J) = P(T(J)) is always invariant.
The inequality $J \ne K = P(J)$ holds in general.
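
The effect of a permutation on the function number can be checked with a short Python sketch. It follows the relation $K_I = J_{P(I)}$ from Definition 2; the function name `permute_code` and the list encoding of P (entry I holds P(I)) are assumptions made for illustration.

```python
def permute_code(J, P):
    """Return K whose bit I equals bit P[I] of J, i.e. K_I = J_{P(I)}."""
    K = 0
    for I, source in enumerate(P):
        K |= ((J >> source) & 1) << I
    return K


# n = 1: exchange the two meta-states, P(0) = 1 and P(1) = 0.
swap = [1, 0]
codes = [permute_code(J, swap) for J in range(4)]
print(codes)
```

The codes of the two constant functions stay fixed, while the codes 1 and 2 are exchanged, so J and K = P(J) differ in general even though the function itself is the same.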
3 Fourth Level of Organisation 
Building upon the three levels (variables, states and functions), a fourth level of 
organisation is introduced. 
3.1 Complementary Operation 


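Definition 3 (Complementary Operator) For any binary (0-1) variable $y \in B_2$, let the
relevant index $\delta \in B_2$ be a complementary operator:


$$y^{\delta} = \begin{cases} \neg y, & \delta = 0 \\ y, & \delta = 1 \end{cases} \tag{6}$$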


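Definition 4 (Complementary Function Operation) For any n-variable function of
$2^n$ meta function vectors $S = S_{2^n-1} \ldots S_I \ldots S_0$, let $\Delta = \delta_{2^n-1} \ldots \delta_I \ldots \delta_0$,
$0 \le I < 2^n$, $\delta_I \in B_2$, $\Delta \in B_2^{2^n}$.


For this type of complementary operation on a function, $\Delta$ is
$$\Delta: T(J) \to K; \quad J, K \in B_2^{2^n}, \quad 0 \le J, K < 2^{2^n}$$
$$S^{\Delta} = S_{2^n-1}^{\delta_{2^n-1}} \ldots S_I^{\delta_I} \ldots S_0^{\delta_0}, \quad S_I \in B_2^n$$
$$T(J)^{\Delta} = T(S_{2^n-1}^{\delta_{2^n-1}}(J_{2^n-1})) \ldots T(S_I^{\delta_I}(J_I)) \ldots T(S_0^{\delta_0}(J_0))$$
$$= T(J)_{2^n-1}^{\delta_{2^n-1}} \ldots T(J)_I^{\delta_I} \ldots T(J)_0^{\delta_0} \tag{7}$$
$$= K_{2^n-1} \ldots K_I \ldots K_0 = K \in B_2^{2^n}$$
$$T(J)_I^{\delta_I} = T(S_I^{\delta_I}(J_I)) = J_I^{\delta_I} = K_I \in B_2$$
$$0 \le I < 2^n, \quad 0 \le J, K < 2^{2^n}, \quad \delta_I \in \Delta$$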


3.2 Invariant Logic Functions Under Permutation and 
Complementary 


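Definition 5 (Permutation and Complementary Operations) For any of the n variables
expressed as $2^n$ meta vectors, Complementary Operations $\Delta \in B_2^{2^n}$ and Permutation
Operations $P \in \Omega(2^n)$ are expressed as


$$(P, \Delta): T(J) \to K; \quad J, K \in B_2^{2^n}, \quad P \in \Omega(2^n), \quad \Delta \in B_2^{2^n}$$
$$P(T(J)^{\Delta}) = P(T(S_{2^n-1}^{\delta_{2^n-1}}(J_{2^n-1}))) \ldots P(T(S_I^{\delta_I}(J_I))) \ldots P(T(S_0^{\delta_0}(J_0)))$$
$$= P(T(J)_{2^n-1}^{\delta_{2^n-1}}) \ldots P(T(J)_I^{\delta_I}) \ldots P(T(J)_0^{\delta_0})$$
$$= K_{2^n-1} \ldots K_I \ldots K_0 = K \in B_2^{2^n} \tag{8}$$
$$P(T(J)_I^{\delta_I}) = P(T(S_I^{\delta_I}(J_I))) = J_{P(I)}^{\delta_{P(I)}} = K_I \in B_2$$
$$0 \le I < 2^n, \quad 0 \le J, K < 2^{2^n}, \quad P \in \Omega(2^n), \quad \delta_I \in \Delta$$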


3.3 Logic Functional Spaces 


Theorem 1 (Logic Function Invariants under Permutation & Complementary Oper- 
ations) For any logic function, the output of Method 2 provides an equivalent output 
as the original Truth Table under all conditions. 


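Proof The Jth row of the permutation and complementary table $P(T^{\Delta})$ for any
$I \in B_2^n$, $J \in B_2^{2^n}$ is constructed by


$$P(T(J)^{\Delta})_I = T(J)_{P(I)}^{\delta_{P(I)}} = \begin{cases} \neg T(J)_{P(I)}, & \delta_{P(I)} = 0 \\ T(J)_{P(I)}, & \delta_{P(I)} = 1 \end{cases} \tag{9}$$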




Counting order |  7   6   5   4   3   2   1   0  |
S              | 111 110 101 100 011 010 001 000 | Binary counting
0              |  0   0   0   0   0   0   0   0  | a full 0 vector
Δ              |  1   1   0   0   1   1   0   0  | a Δ-vector
¬Δ             |  0   0   1   1   0   0   1   1  | a not Δ-vector
1              |  1   1   1   1   1   1   1   1  | a full 1 vector
T(178)         |  1   0   1   1   0   0   1   0  | initial value
T(178)^1       |  1   0   1   1   0   0   1   0  | T(178) Truth
T(178)^{¬Δ}    |  0   1   1   1   1   1   1   0  | T(178) Δ-Variant
T(178)^0       |  0   1   0   0   1   1   0   1  | T(178) False
T(178)^{Δ}     |  1   0   0   0   0   0   0   1  | T(178) Δ-Invariant


Method 2: Permutation and Complementary Methods Table $P(T^{\Delta})$
Input: x: n variables in a binary {0, 1} sequence; J: the selected function number;
$P \in \Omega(2^n)$ and $\Delta \in B_2^{2^n}$: the Permutation and Complementary operators
Process: The input sequence x establishes the meta-state number I, and the P(I)-th
column is selected. This represents the I-th column of the function $P(T(J)^{\Delta})$
Output: If $\delta_{P(I)} = 1$, return the value of $T(J)_{P(I)}$ (1 for true and 0 for false);
if $\delta_{P(I)} = 0$, return $\neg T(J)_{P(I)}$.
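
Method 2 can be sketched in Python as follows. This is an illustrative reading rather than a reference implementation: the names `method2` and `table_row` are hypothetical, P is encoded as a list whose entry I holds P(I), and Δ is stored as an integer whose bit k is $\delta_k$.

```python
def method2(I, J, P, delta):
    """Return the I-th entry of P(T(J)^Delta): keep T(J)_{P(I)} if delta_{P(I)} = 1, else negate it."""
    column = P[I]
    value = (J >> column) & 1
    return value if (delta >> column) & 1 else 1 - value


def table_row(J, P, delta, width):
    """Collect the entries for columns 0 .. width-1 into a single integer code K."""
    K = 0
    for I in range(width):
        K |= method2(I, J, P, delta) << I
    return K


identity = list(range(8))      # no permutation, to isolate the complementary operator
DELTA = 0b11001100             # the Delta vector used in the T(178) table
NOT_DELTA = DELTA ^ 0xFF

truth = table_row(178, identity, 0xFF, 8)
false = table_row(178, identity, 0x00, 8)
invariant = table_row(178, identity, DELTA, 8)
variant = table_row(178, identity, NOT_DELTA, 8)
for K in (truth, false, invariant, variant):
    print(format(K, "08b"))
```

With the identity permutation the four printed rows can be compared against the Truth, False, Δ-Invariant and Δ-Variant rows of the T(178) table.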


After using Method 2, the results are shown:


$$P(T(J)^{\Delta})_I = \begin{cases} \neg T(J)_{P(I)}, & \delta_{P(I)} = 0 \\ T(J)_{P(I)}, & \delta_{P(I)} = 1 \end{cases} \tag{10}$$


Theorem 2 (Permutation Group for Meta Function Vector) For $2^n$ meta function
vectors, the total number of permutations is $2^n!$.


Theorem 3 (Permutation & Complementary Structure) Under permutation and
complementary operations, a total of $2^n! \cdot 2^{2^n}$ permutations can be generated to form
a logic functional space for the n variables.
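
The growth of the space stated in Theorem 3 can be tabulated directly. The following Python sketch simply evaluates the two formulas; the function names are hypothetical.

```python
from math import factorial


def classical_size(n):
    """Number of Boolean functions of n variables: 2^(2^n)."""
    return 2 ** (2 ** n)


def functional_space_size(n):
    """Size of the permutation and complementary space: 2^n! * 2^(2^n)."""
    return factorial(2 ** n) * 2 ** (2 ** n)


for n in (1, 2, 3):
    print(n, classical_size(n), functional_space_size(n))
```

For n = 1 the space has eight cases, and for n = 3 it already exceeds $10^7$ configurations, in line with the remark in Sect. 1.2.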


4 Different Coding Schemes: One- and Two-Dimensional 
Representations 


The initial step is to construct a series of logic functionals. Permutation and
complementary differences can be shown in the proposed invariant function structures.
Different coding schemes under different symmetric restrictions are established. Four
schemes are described, of which one is a one-dimensional representation
and the other three are two-dimensional representations. For binary sequences
in sequential counting order, the scheme is known as the SL (Shao Yong & Leibniz)
coding scheme.
4.1 G Coding


The General Code (G) is used to map permutation & complementary operations. For
any state in the G coding scheme having $2^n$ bits,


$$G: (J, \Delta, P) \to K; \quad J, K \in B_2^{2^n}; \quad \Delta \in B_2^{2^n}, \quad P \in \Omega(2^n). \tag{11}$$


4.2 W Coding 


Starting from the G coding scheme, the $2^n$ bits of a code are separated into two parts
of equal length to form a 2D representation. This mapping mechanism can represent a
function space as a W coding scheme.


$$W: (J, \Delta, P) \to K = (J^1|J^0) \tag{12}$$
$$J, K \in B_2^{2^n}; \quad J^1, J^0 \in B_2^{2^{n-1}}; \quad \Delta \in B_2^{2^n}, \quad P \in \Omega(2^n)$$


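Under this representation, a given logic functional for the function space is illustrated
as a fixed matrix.


$$\begin{pmatrix}
(0|0) & \cdots & (0|J^0) & \cdots & (0|2^{2^{n-1}}-1) \\
\vdots & & \vdots & & \vdots \\
(J^1|0) & \cdots & (J^1|J^0) & \cdots & (J^1|2^{2^{n-1}}-1) \\
\vdots & & \vdots & & \vdots \\
(2^{2^{n-1}}-1|0) & \cdots & (2^{2^{n-1}}-1|J^0) & \cdots & (2^{2^{n-1}}-1|2^{2^{n-1}}-1)
\end{pmatrix}$$

$$0 \le J^0, J^1 < 2^{2^{n-1}}; \quad 0 \le J < 2^{2^n}$$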
In the one-variable condition, there are eight cases in the logic functional spaces,
as follows (T: Truth, Δ-V: Δ-Variant, Δ-IV: Δ-Invariant, F: False; each code is
followed by its W pair; the second group is taken after the permutation P):


f    T  W        Δ-V  W      Δ-IV  W     F  W
0    0  (0|0)    2  (1|0)    1  (0|1)    3  (1|1)
¬x   1  (0|1)    3  (1|1)    0  (0|0)    2  (1|0)
x    2  (1|0)    0  (0|0)    3  (1|1)    1  (0|1)
1    3  (1|1)    1  (0|1)    2  (1|0)    0  (0|0)

f    P, T  W     P, Δ-V  W   P, Δ-IV  W  P, F  W
0    0  (0|0)    1  (0|1)    2  (1|0)    3  (1|1)
¬x   2  (1|0)    3  (1|1)    0  (0|0)    1  (0|1)
x    1  (0|1)    0  (0|0)    3  (1|1)    2  (1|0)
1    3  (1|1)    2  (1|0)    1  (0|1)    0  (0|0)
For better visualisation and expression, the one-dimensional G coding scheme is
converted into a two-dimensional W coding scheme.


[Figure: two groups of 2×2 W coding matrices of the one-variable functions, one matrix each for the Truth, Δ-Variant, Δ-Invariant and False cases.]


4.3 F Coding 


Using the 2D representation, a symmetric condition can be added to arrange meta-states
into a specific order. For each pair of states in W, if they satisfy the following condition,
then a refined code, the F coding scheme, is determined.


F coding scheme:

$J^1$, the $I$th meta-state   ⇔   $J^0$, the $I$th meta-state
$X \in S^1$                   ⇔   $\neg X \in S^0$

4.4 C Coding 


In addition to a pair of states being in a complementary relationship, further structure is
introduced onto the F code. When, in addition, the states of each part share the same value
in their ith position, they form a C coding scheme.


C coding scheme:

$S^1$, the $I$th meta-state                ⇔   $S^0$, the $I$th meta-state                (F coding scheme)
$\forall x_i \in S^1$, $x_i = 1(0)$        ⇔   $\forall x_i \in S^0$, $x_i = 0(1)$        (general conjugate)


The C coding scheme has the strongest symmetric conditions available. Only
a relatively small number among the three invariant groups can be identified within
this scheme.
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5 Two-Variable Cases 


Four groups of the proposed schemes are selected as examples. Each group of a
logic functional represents 16 logic functions as 4×4 images. The four groups are arranged
as 2×2 blocks according to their Truth/False and Δ-Variant/Δ-Invariant properties. The 2×2
blocks correspond to:

Truth Block    | Δ-Variant
Δ-Invariant    | False Block

Each block contains 16 entries of function images in a 4×4 ($2^2 \times 2^2$) configuration.
Each image entry denotes a transformed number and its function number, where
$K = (J^1|J^0)$ is the transformed number and J is the function number. In all four figures,
(a) shows the 2×2 base blocks representing function images and (b) shows the 2×2 vector
blocks representing the relevant coding schemes.

In Fig. 1, the counting order of meta-states is arranged as W coding (SL
code): P = (3210), P(Δ) = 1010. In this group, only Functions 6 and 9 can be
observed in the complementary symmetric condition along the main diagonal.

In Fig. 2, a variation of the configuration within W coding, P = (2301), P(Δ) =
0101, creates effects similar to those seen in Fig. 1.

In Fig. 3, the F coding scheme is shown: under this configuration, P = (2310),
P(Δ) = 0110. Six pairs (0:15, 1:7, 2:11, 4:13, 6:9, 8:14) of complementary
functions can be identified. The group has four blocks containing the same pairs of
configurations.

In Fig. 4, C coding is represented: P = (3102), P(Δ) = 1100. In addition to
the same six pairs as in F coding, the four corners are occupied by four functions (0, 5, 10, 15)
in all blocks. This yields the most regular structure of all the coding schemes.
Fig. 1 W coding (SL code): P = (3210), P(Δ) = 1010; (a) 2×2 base blocks, (b) 2×2 vector blocks

Fig. 2 W coding: P = (2301), P(Δ) = 0101; (a) 2×2 base blocks, (b) 2×2 vector blocks

Fig. 3 F coding: P = (2310), P(Δ) = 0110; (a) 2×2 base blocks, (b) 2×2 vector blocks

Fig. 4 C coding: P = (3102), P(Δ) = 1100; (a) 2×2 base blocks, (b) 2×2 vector blocks
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6 Conclusion 


It is shown in this chapter that arranging the binary function space using four
levels of classification adds symmetry and regular structure to the
entire space of binary functions. For ease of visualisation, it is convenient to apply a 2D
representation mechanism that enables symmetric configurations of the system to be
analysed via different coding schemes. Binary functional spaces provide additional
information to generate large numbers of potential configurations in order
to arrange and organise logic phase spaces.

The mechanism can be developed further to establish a solid logic foundation at
the logic functional level for theoretical explorations and practical applications. We aim
to refine the investigation of different coding schemes within the highest levels
of organisation in our future work.


Acknowledgements Thanks to Mr. J. Wan for generating all sample images and configurations and
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University. 
Abstract In modern logic, various systems have been proposed extending classical
Boolean logic & switching theory. Such logic frameworks include multiple-valued
logic, probability logic, fuzzy logic, module logic, quantum logic and various other
frameworks. Although these extensions have been applied to many applications in
mathematics, science and engineering, every extension of Boolean logic
invalidates at least one of the six fundamental rules of Boolean logic shown in L1 to L6.
We propose a new framework of logic, variant logic, extending Boolean logic whilst
satisfying the six fundamental rules (L1-L6). By defining the Variant-Invariant
behaviour of logical operations, this framework can be constructed using four types of
general operators. The main results of the chapter are summarized in Theorems 8-10.
To show significant differences between classical logic and the new variant
logic, invariant properties of this hierarchical organization are discussed. The simplest
cases of one-variable conditions are illustrated. Variant logic can provide the
necessary framework to support analysis and description of Cellular Automata, Fractal
Theory, Chaos Theory and other systems dealing with complexity. Such applications
of this framework will be explored in future papers.
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1 Laws of Logic Systems 


1.1 Laws in Classical Logic Systems 


Classical logic identifies a class of formal logics that is characterized by a number
of properties [1-17].


Definition 1 A logic system is a classical logic system if it satisfies all of CL1-CL5.
The five properties of classical logic (CL1-CL5) are listed as follows:


CL1: Law of the excluded middle and double negative elimination 
CL2: Law of non-contradiction 

CL3: Monotonicity and idempotency of entailment 

CL4: Commutativity of conjunction 

CL5: De Morgan duality 


Examples of such classical logic systems include works of philosophy and religion
(Aristotle’s Organon; Nagarjuna’s tetralemma; and Avicenna’s temporal modal logic)
as well as foundational logic systems such as the reformulations by George Boole and
Gottlob Frege [4-17]. These properties can be rewritten as simplified equations
describing basic properties of a logic system using characteristics of the five classical
properties. The following equations (L1-L6) describe such a system.


L1: $P \cup P = P$            Idempotency
L2: $P \cap P = P$            ...
L3: $\neg P \cup P = 1$       Excluded Middle
L4: $\neg P \cap P = 0$       ...
L5: $\neg\neg P = P$          Double Negative Elimination
L6: $P, P \to Q$


The set of equations can be applied in the analysis of modern logic systems
to determine whether they are all satisfied. The equations will be defined as canonical
properties, and a logic system satisfying all six properties will be defined as a canonical
system. A logic system that does not satisfy all six is categorized as non-canonical.


1.2 Current Logic Systems 


Many modern logic systems cannot satisfy the six canonical properties. The three-valued
logic proposed by Łukasiewicz in 1920 can satisfy L3-L6 but cannot satisfy L1-L4.
The probability logic proposed by Reichenbach in 1949 can satisfy L5-L6 but cannot satisfy
L1-L4. The fuzzy logic proposed by Zadeh in 1965 satisfies L1, L2, L5 and L6 but cannot satisfy
L3-L4. Since they cannot satisfy all the canonical properties, they are all non-canonical
logic systems [1-22].
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2 Truth Valued Representation in Boolean Logic Systems 


For any n-variable Boolean logic system, it is natural to establish $2^n$ states. By marking
each state as either selected or not selected, a truth table can be built up for a given
Boolean function. Collecting all possible selections, a full truth table is constructed
with $2^n$ columns and $2^{2^n}$ rows. This table can be listed as follows:


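$0 \le I < 2^n$       | $2^n-1$        ...  $I$                              ...  1          0
$0 \le i < n$         | 1...1...1      ...  $I_{n-1} \ldots I_i \ldots I_0$    ...  0...0...1  0...0...0
$0 \le J < 2^{2^n}$   |
0                     | 0              ...  0                                ...  0          0
1                     | 0              ...  0                                ...  0          1
2                     | 0              ...  0                                ...  1          0
$J$                   | $J_{2^n-1}$    ...  $J_I$                            ...  $J_1$      $J_0$
$2^{2^n}-2$           | 1              ...  1                                ...  1          0
$2^{2^n}-1$           | 1              ...  1                                ...  1          1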


where there are three parameters i, I, J, with $0 \le i < n$, $0 \le I < 2^n$, $0 \le J < 2^{2^n}$,
corresponding to variable, state and function numbers, respectively. Under such
conditions, for any J, it is convenient to use a Karnaugh map or relevant logic tools to
construct the given Boolean function in combination [6-17].


3 Cellular Automata Representations 


Cellular Automata (CA) use a different mechanism [23-35] to represent a given
function. In a one-dimensional form of CA, an N-length binary sequence is


$$X = X_{N-1}X_{N-2} \ldots X_j \ldots X_1X_0, \quad 0 \le j < N, \quad X_j \in \{0, 1\} = B_2$$
For a given function f, the output sequence is defined as follows: $f: X \to Y$, $Y = f(X)$,
$$Y = Y_{N-1}Y_{N-2} \ldots Y_j \ldots Y_1Y_0, \quad 0 \le j < N, \quad Y_j \in B_2$$


It is feasible to use a moving window with a fixed length n to separate X into local
kernels of length n. The kernel can be presented as


$$[\ldots X_j \ldots] = x_{n-1} \ldots x_i \ldots x_0, \quad x_i \in B_2.$$


For a given function f, it is necessary to assign a certain position i in the kernel to be
associated with position j of both sequences. We have


$$y = f(x_{n-1} \ldots x_i \ldots x_0) = f([\ldots X_j \ldots]) = Y_j$$
or $X_j = X_j^t$, $Y_j = X_j^{t+1}$, i.e.


$$f: X_j^t \to X_j^{t+1}, \quad X_j^t, X_j^{t+1} \in B_2$$

4 Variant Construction 


4.1 Four Variation Forms 


Considering $f: X_j^t \to X_j^{t+1}$ for any function of a Boolean logic system in order to analyse
its variation properties [36-40], the following proposition holds.


Proposition 1 For any transformation $f: X_j^t \to X_j^{t+1}$, four forms of transforming
classes are identified: TA: 0 → 0, TB: 0 → 1, TC: 1 → 0, TD: 1 → 1.


Proof $X_j$, $Y_j$ are 0-1 variables, so only the four classes listed are possible. □


Definition 2 The four transforming forms correspond to the following sets: TA:
Invariant class for 0 value, TB: Variant class for 0 value, TC: Variant class for 1 value,
TD: Invariant class for 1 value.

Under this definition, the following proposition can be established.


Proposition 2 Using four classes of transformation, four variant operations are
defined.


Type | $X_j \to Y_j$ | Truth | Variant | Invariant | False
TA   | 0 → 0         | 0     | 0       | 1         | 1
TB   | 0 → 1         | 1     | 1       | 0         | 0
TC   | 1 → 0         | 0     | 1       | 0         | 1
TD   | 1 → 1         | 1     | 0       | 1         | 0


Proof Truth (False) values are determined by $Y_j$ ($\neg Y_j$), and Variant (Invariant) values
are determined by {TB, TC} for 1(0) and {TA, TD} for 0(1), respectively. □
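
The four columns of the table can be reproduced with elementary bit operations. In the Python sketch below the function name `variation_values` is hypothetical; the exclusive-or is used as the test for whether a value changes.

```python
def variation_values(x, y):
    """Return the (Truth, Variant, Invariant, False) values for a transition x -> y."""
    truth = y
    variant = x ^ y            # 1 when the value changes (classes TB and TC)
    invariant = 1 - variant    # 1 when the value is kept (classes TA and TD)
    false = 1 - y
    return truth, variant, invariant, false


classes = {"TA": (0, 0), "TB": (0, 1), "TC": (1, 0), "TD": (1, 1)}
for name, (x, y) in classes.items():
    print(name, x, y, variation_values(x, y))
```

Truth/False and Variant/Invariant are complements of each other in every row, which is the relationship used in Theorem 1.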


Theorem 1 In the {Truth, Variant, Invariant, False} groups, only two pairs of groups,
{Truth, False} and {Variant, Invariant}, satisfy L1-L6 to form a canonical logic system.


Proof Both groups are composed of 0-1 variables; in addition, Truth/False and
Variant/Invariant form complement relationships. Other combinations contain
common parts, so it is not possible for them to satisfy the canonical logic conditions L1-L6. □

Definition 3 The sequential binary numbering is defined as SL coding, in memory of the
contributions of Y. Shao and Leibniz [41-49] to binary logic.


Definition 4 The operator $BN: J \to B$ converts an integer to its binary
representation. The operator $DC: B \to J$ converts a binary number to its decimal
representation.


Definition 5 The SL coding scheme is an ordering of binary table outputs $T: B_2^{2^n} \to J$.
An element $J_I \in SL$ at position $I$, where $0 \le I < 2^n$, represents function $T_I$ such
that the binary representation of $T_I$ is defined as


$$BN(J) = T_{2^n-1}[J_{2^n-1}] \ldots T_I[J_I] \ldots T_0[J_0]$$


For any n-variable structure, J is composed of $2^n$ bits to represent the numbers
$0 \le J < 2^{2^n}$.


Definition 6 A G coding scheme is defined as an ordering of binary table outputs
$T: B_2^{2^n} \to J$. An element $J_I \in SL$ at position $I$, where $0 \le I < 2^n$, represents function
$T_I$ such that the binary representation of $T_I$ is defined as


$$G = \{\forall J \mid T(J), \ 0 \le J < 2^{2^n}\};$$
$$T(J) = T_{2^n-1}[\Psi(J_{2^n-1})] \ldots T_I[\Psi(J_I)] \ldots T_0[\Psi(J_0)], \quad 0 \le I < 2^n$$


where $\{\Psi(J_I), 0 \le I < 2^n\}$ are $2^n$ 0-1 vectors of length n, with
$\Psi(J_{2^n-1}) \ne \ldots \ne \Psi(J_I) \ne \ldots \ne \Psi(J_0)$, respectively.

Under the G coding scheme, the ordering number is an integer sequence with $2^{2^n}$
positions. Different transformations make this sequence extremely complex. For
convenience of representation, a two-dimensional W coding scheme is proposed.


Definition 7 A W coding scheme is defined as an ordering pair of binary table
outputs $T: B_2^{2^n} \to (J^1|J^0)$. Each component is composed of $2^{n-1}$ bits in representation:


$$(J^1|J^0) = T_{2^n-1}[\Psi(J_{2^n-1})] \ldots T_I[\Psi(J_I)] \ldots T_0[\Psi(J_0)], \quad 0 \le I < 2^n$$
$$J^0 = \{\forall I \mid BN(J_I \bmod 2^{n-1}), \ 0 \le I < 2^{n-1}\}$$
$$J^1 = \{\forall I \mid BN(J_I \bmod 2^{n-1}), \ 2^{n-1} \le I < 2^n\}$$
Under this construction, a G coding scheme is transformed into a W coding scheme
to represent a two-dimensional structure for different permutation results. In general,
$J^0$ represents the lower $2^{n-1}$ bits and $J^1$ represents the higher $2^{n-1}$ bits, respectively. A
general structure of W coding is a $2^{2^{n-1}} \times 2^{2^{n-1}}$ matrix, shown in the following figure.

$$\begin{pmatrix}
(0|0) & \cdots & (0|J^0) & \cdots & (0|2^{2^{n-1}}-1) \\
(J^1|0) & \cdots & (J^1|J^0) & \cdots & (J^1|2^{2^{n-1}}-1) \\
(2^{2^{n-1}}-1|0) & \cdots & (2^{2^{n-1}}-1|J^0) & \cdots & (2^{2^{n-1}}-1|2^{2^{n-1}}-1)
\end{pmatrix}$$

$0 \le J^0, J^1 < 2^{2^{n-1}}$; $\{(J^1|J^0)\}$: 2D Space for $2^{2^n}$ Functions


4.2 Complement and Variant Operators 


Definition 8 In $B_2^{2^n}$, the generalized complement $Y^Q$, $Q \in B_2^{2^n}$, of a variable Y is
defined to be the element obtained by complementing the components of Y
according to the value of the corresponding component of Q; $Y_I$ is complemented or
un-complemented if $Q_I$ is 0 or 1, respectively, where $Y_I$ and $Q_I$ designate the Ith
component of Y and Q.


For example, in $B_2^4$ the generalized complements for $Q \in \{0101, 0110\}$ are as follows:


$Y$          | 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
$Y^{0101}$   | 1010 1011 1000 1001 1110 1111 1100 1101 0010 0011 0000 0001 0110 0111 0100 0101
$Y^{0110}$   | 1001 1000 1011 1010 1101 1100 1111 1110 0001 0000 0011 0010 0101 0100 0111 0110
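
The generalized complement amounts to a bitwise XNOR of Y and Q, which the following Python sketch uses to regenerate the two rows above. The function name `generalized_complement` and the explicit `width` parameter are assumptions for illustration.

```python
def generalized_complement(Y, Q, width):
    """Complement bit I of Y where Q_I = 0 and keep it where Q_I = 1."""
    mask = (1 << width) - 1
    return ~(Y ^ Q) & mask   # XNOR of Y and Q, restricted to `width` bits


for Q in (0b0101, 0b0110):
    values = [format(generalized_complement(Y, Q, 4), "04b") for Y in range(16)]
    print(format(Q, "04b"), values)
```

Choosing Q as the full 1 vector returns Y unchanged and the full 0 vector returns the ordinary complement, so the classical complement is a special case.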


Applying the Q operator to $2^n$ meta vectors generates a vector family.


Proposition 3 In $B_2^{2^n}$, the generalized complement operator $Q \in B_2^{2^n}$ has $2^{2^n}$ different
cases.


Proof Q is a vector of $2^n$ bits, and each position can be selected as 0 or 1, so the total
number of selections is equal to $2^{2^n}$. □


Definition 9 For $2^n$ meta states composed of the vector $\Psi$, the ith vector $\Psi(i)$,
$0 \le i < n$, has $2^n$ bits. Four vectors of $2^n$ bits, $\{0, \Psi(i), \neg\Psi(i), 1\}$, can be selected as Q
operators. This special form of Q type operation is defined as a QV operation.


Proposition 4 For a QV operator, $QV \in \{0, \Psi(i), \neg\Psi(i), 1\}$, the four QV vectors
provide the following complement results, respectively, in transformation:


0 : False Operator
1 : Truth Operator
$\Psi(i)$ : Invariant Operator
$\neg\Psi(i)$ : Variant Operator


Proof The 1 operator keeps the original truth table values; the 0 operator reverses all values;
the $\Psi(i)$ operator produces the invariant condition and the $\neg\Psi(i)$ operator generates the
variant property. □
Proposition 5 Under QV operations, $2^{n+1}$ cases are generated as a
complement variant group.


Proof Only $0 \le i < n$ is selected; each position has two selections associated with i,
plus two constant vectors. So a total of $2 \times 2^n = 2^{n+1}$ cases can be generated. □


Definition 10 For $2^n$ meta vectors $\Psi$, the Ith component $\Psi(I)$ has $2^{2^n}$
bits. A permutation operator P moves the Ith component to the P(I)th component for
$\forall I$, $0 \le I < 2^n$, respectively.


Proposition 6 Under the P operation on $2^n$ meta vectors in $\Psi$, a total of $2^n!$
permutations can be generated.


Proof The P operator is equal to a permutation on $2^n$ integers. This generates a symmetric
group containing $2^n!$ members. □


Proposition 7 Under Q and P operators in $\Psi$, a total of $2^{2^n} \cdot 2^n!$ cases can be
created. This creates a Complement Permutation Structure (CPS).


Proof Q and P operators are independent of each other. Their results can be
multiplied together. □


Proposition 8 Under QV and P operators in $\Psi$, a total of $2^{n+1} \cdot 2^n!$ cases can
be created. This creates a Complement Variant Structure (CVS).


Proof QV and P operators are independent of each other. Their results can be
multiplied together. □


4.3 Other Global Coding Schemes 


Under QV + P and Q + P operations, more coding schemes can be defined. 


Definition 11 The F coding scheme is defined as a subset of W. A W code is an F code if
its meta states can be paired such that $\forall j_1$, $j_1 - 2^{n-1} = j_0$, $0 \le j_0 < 2^{n-1} \le j_1 < 2^n$,
$I_{j_1} = \neg I_{j_0}$, i.e. state $I_{j_1}$ is the complement of state $I_{j_0}$.


F coding imposes restricted pair conditions on the structure. Its corresponding
form is as follows:


$J^1$ j-th meta state   ⇔   $J^0$ j-th meta state      (F coding base)
$X$                     ⇔   $\neg X$


Definition 12 A coding scheme satisfies the general conjugate condition if, for the
selected position i, $0 \le i < n$, $\forall I_j \in I_{j_0}$, $\forall a_i \in I_j$, $a_i = 0$.

In other words, the general conjugate condition makes the selected position 0-valued
in the lower part and 1-valued in the higher part, respectively.


Definition 13 The C coding scheme is defined as the set of F codings for which, for the
selected position i, $0 \le i < n$, $\forall I_j \in I_{j_0}$, $\forall a_i \in I_j$, $a_i = 0$.


C coding imposes stronger restrictions, separating all 0-valued meta states into the
lower part and all 1-valued meta states into the higher part.


$J^1$ j-th meta state                ⇔   $J^0$ j-th meta state                (F coding)
$\forall x_i \in J^1$, $x_i = 1$     ⇔   $\forall x_i \in J^0$, $x_i = 0$     (General Conjugate; C coding base)


Some coding samples are listed in the following table:


No.     |  7   6   5   4   3   2   1   0  | Normal sequential number
SL      | 111 110 101 100 011 010 001 000 | Ordering sequence
Truth   |  0   0   0   1   1   1   1   0  | G: J = 30; W: (1|14)
Variant |  1   1   0   1   0   0   1   0  | G: J = 210; W: (13|2)
W       | 111 110 010 011 001 000 100 101 | General Conjugate, without pairs
Truth   |  0   0   1   1   1   0   1   0  | G: J = 58; W: (3|10)
Variant |  1   1   0   0   1   0   1   0  | G: J = 202; W: (12|10)
F       | 111 110 101 100 000 001 010 011 | Meta states in pairs
Truth   |  0   0   0   1   0   1   1   1  | G: J = 23; F: (1|7)
Variant |  1   1   0   1   0   1   0   0  | G: J = 212; F: (13|4)
C       | 111 110 010 011 000 001 101 100 | General Conjugate + pairs
Truth   |  0   0   1   1   0   1   0   1  | G: J = 53; C: (3|5)
Variant |  1   1   0   0   0   1   0   1  | G: J = 197; C: (12|5)
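
The G codes and their two-dimensional pairs can be recomputed from the bit rows of the samples. The Python sketch below does this under assumptions inferred from the table rather than stated in the text: the sample function is the one with SL number 30, the Variant row is taken relative to the middle variable, and the helper names are hypothetical.

```python
def center_bit(state):
    """Middle variable of a 3-bit meta-state."""
    return (state >> 1) & 1


def sample_codes(order, J=30, n=3):
    """Return (truth code, its pair, variant code, its pair) for a meta-state ordering.

    `order` lists the meta-states from the leftmost column to the rightmost one.
    """
    truth = variant = 0
    for state in order:
        t = (J >> state) & 1                       # value of the SL function J on this state
        truth = (truth << 1) | t
        variant = (variant << 1) | (t ^ center_bit(state))
    half = 2 ** (n - 1)

    def split(K):
        return (K >> half, K & ((1 << half) - 1))  # (J^1 | J^0): higher and lower halves

    return truth, split(truth), variant, split(variant)


orders = {
    "SL": [7, 6, 5, 4, 3, 2, 1, 0],
    "W": [7, 6, 2, 3, 1, 0, 4, 5],
    "F": [7, 6, 5, 4, 0, 1, 2, 3],
    "C": [7, 6, 2, 3, 0, 1, 5, 4],
}
for name, order in orders.items():
    print(name, sample_codes(order))
```

Only the ordering changes between the four schemes, so the same function receives a different code in each of them.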


4.4 Sizes of Variant Spaces 
Definition 14 Under QV + P operations, W, F and C coding schemes are defined 
as WV, FV and CV coding schemes, respectively. 


Theorem 2 For a W coding scheme of n variables, a total of $2^{2^n} \cdot (2^n)!$ cases
are distinguished.


Theorem 3 For a WV coding scheme of n variables, a total of $2^{n+1} \cdot (2^n)!$ cases
are distinguished.


Theorem 4 For an F coding scheme of n variables, a total of $2^{2^n} \cdot 2^{2^{n-1}} \cdot (2^{n-1})! = 2^{2^n(1+1/2)} \cdot (2^{n-1})!$
cases are distinguished.


Theorem 5 For an FV coding scheme of n variables, a total of $2^{n+1} \cdot 2^{2^{n-1}} \cdot (2^{n-1})! = 2^{2^{n-1}+n+1} \cdot (2^{n-1})!$
cases are distinguished.
Theorem 6 For a C coding scheme of n variables, a total of $2^{2^n} \cdot (2^{n-1})!$ cases
are distinguished.


Theorem 7 For a CV coding scheme of n variables, a total of $2^{n+1} \cdot (2^{n-1})!$
cases are distinguished.


Using the definitions of the different coding schemes, the various sequences for the
one-variable case are shown in the following table:


Function | Truth W coding | Variant W coding | Invariant WV coding | False WV coding
0        | 0 (0|0)        | 2 (1|0)          | 1 (0|1)             | 3 (1|1)
x̄        | 1 (0|1)        | 3 (1|1)          | 0 (0|0)             | 2 (1|0)
x        | 2 (1|0)        | 0 (0|0)          | 3 (1|1)             | 1 (0|1)
1        | 3 (1|1)        | 1 (0|1)          | 2 (1|0)             | 0 (0|0)
0        | 0 (0|0)        | 1 (0|1)          | 2 (1|0)             | 3 (1|1)
x̄        | 2 (1|0)        | 3 (1|1)          | 0 (0|0)             | 1 (0|1)
x        | 1 (0|1)        | 0 (0|0)          | 3 (1|1)             | 2 (1|0)
1        | 3 (1|1)        | 2 (1|0)          | 1 (0|1)             | 0 (0|0)
using 2D W coding to arrange 1D sequences into 2D matrices: 
Truth Variant Truth Variant 
TIX 0 
Original: z l = Permutation: - : 
1 x|x 0 0 x|x 0 
Invariant False Invariant False 


5 Invariant Properties of Variant Constructions 


It is interesting to notice that under QV operations, there are 2n + 2 vectors available
to generate QVS. This makes a significant difference between classical logic and
variant logic construction [50-56]. The main results of this chapter are summarized
in the following theorems.


Theorem 8 (Four Invariant Points for One Variable Condition) For a W coding
scheme under the one-variable condition, four points of the structure correspond to the four
functions {0, x, x̄, 1}, respectively.


Proof When n = 1, four vectors are available for any Q or QV operations. □


Theorem 9 (Two Invariant Points for Truth and False Schemes) For any n > 1 and any
W (WV) coding scheme, in any truth or false representation only the all-0 and all-1
valued vectors are invariant under P operations.


Proof Under a P operation, the binary number sequence of any vector that is not all 0
or all 1 will be changed. □


Theorem 10 (Four Invariant Points for C Coding Scheme) For any C (CV) coding 
scheme in variant construction, four corner positions of 2D function matrix have 
extreme invariant properties. 


Proof Under the C (CV) coding scheme, the four functions {0, x, x̄, 1} correspond as
follows: $x = (0|0)$; $0 = (2^{2^{n-1}} - 1|0)$; $1 = (0|2^{2^{n-1}} - 1)$; $\bar{x} = (2^{2^{n-1}} - 1|2^{2^{n-1}} - 1)$.
The four positions are all corner points of the variant matrix. □


6 Comparison 


It is convenient to list numeric parameters to compare the different coding schemes 
in the following table. 


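Var | State | Function | ExPower | SL | W coding | WV coding | C coding | CV coding
n | $2^n$ | $2^{2^n}$ | $(2^n)!$ | 1 | $2^{2^n} \cdot (2^n)!$ | $2^{n+1} \cdot (2^n)!$ | $2^{2^n} \cdot (2^{n-1})!$ | $2^{n+1} \cdot (2^{n-1})!$
1 | 2 | 4 | 2 | 1 | 8 | 8 | 4 | 4
2 | 4 | 16 | 24 | 1 | 384 | 192 | 32 | 16
3 | 8 | 256 | 40320 | 1 | 10321920 | 645120 | 6144 | 384
4 | 16 | $2^{16}$ | 16! | 1 | $2^{16} \cdot 16!$ | $32 \cdot 16!$ | $2^{16} \cdot 8!$ | $32 \cdot 8!$
5 | 32 | $2^{32}$ | 32! | 1 | $2^{32} \cdot 32!$ | $64 \cdot 32!$ | $2^{32} \cdot 16!$ | $64 \cdot 16!$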


where we use Var: variable number; State: state number; Function: function number; 
ExPower: exponent power products; SL: SL coding number; W coding: W coding 
number under Q + P operations; WV coding: WV coding number under QV + P 
operations; C coding: C coding number under Q + P operations; CV coding: CV 
coding number under QV + P operations in the table, respectively. 
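
The numeric rows of the comparison can be reproduced with a few lines of code. This is a sketch under the assumption that the closed forms read as W = 2^(2^n) · (2^n)!, WV = 2^(n+1) · (2^n)!, C = 2^(2^n) · (2^(n-1))! and CV = 2^(n+1) · (2^(n-1))!; the function name `coding_counts` is illustrative.

```python
from math import factorial


def coding_counts(n):
    """Return the number of distinguished cases per coding scheme for n variables."""
    states = 2 ** n            # number of meta states
    functions = 2 ** states    # number of Boolean functions
    return {
        "W": functions * factorial(states),
        "WV": 2 ** (n + 1) * factorial(states),
        "C": functions * factorial(states // 2),
        "CV": 2 ** (n + 1) * factorial(states // 2),
    }


print(coding_counts(3))
# {'W': 10321920, 'WV': 645120, 'C': 6144, 'CV': 384}
```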


7 Conclusion 


In this chapter, variant logic has been proposed to extend truth table representation 
that describes variant properties of binary sequences. This extension is required to
expand the traditional Boolean logic framework to a new variation space. Under two types
of vector operations, the new space has 272”! times more complexity than traditional 
Boolean function space with 27" members. In order to manage this complexity, the 
framework has proposed a series of global coding schemes encoded through sym- 
metric properties representing the elements in a matrix as a 2D map. Under this 
two-dimensional model, coding mechanism can be constructed and their invariant 
properties can be discussed. 

Boolean function space represents a core invariant functional space and the newly 
expanded space broadens the descriptions and coding schemes used. Thus, a wide 
area of variation coding can be developed. In essence, the space of binary sequence 
functions can be thought of as a keyboard with 27’ notes. Each note contains a 
complete Boolean function set and its own representation. The set of notes can be 
represented using a coding scheme that orders the notes in a particular sequence (SL 
and G codes) or their 2D maps (W, F and C codes). 

Under the W coding representation mechanism, a 2D matrix is suitable for visualizing
permutation sequences of n-variable logic structures. Using invariant properties,
classical logic and variant logic can be clearly distinguished. Further work can
explore the dynamic behaviours of complex dynamic systems. This chapter outlines
only the construction and notation of variant logic. Future papers will show that
the proposed scheme, with its foundation in symmetry, has definite uses for
predicting convergent and chaotic behaviour in dynamic binary systems, such as the
analysis of cellular automata rules using various visual methodologies.
Part II
Theoretical Foundation: Variant
Measurement


All of mathematics is a tale about groups. 
—Henri Poincaré 


In geometric and physical applications, it always turns out that a 
quantity is characterized not only by its tensor order, but also by 
symmetry. 

—Hermann Weyl 


Nothing exists until it is measured. 
—Niels Bohr 


A series of research papers on variant measurement was published during 2011-2012.
Two open-access book chapters that express the core results of variant
measurement (“From Local Interactive Measurements to Global Matrix
Representations on Variant Construction” and “From Conditional Probability
Measurements to Global Matrix Representations on Variant Construction”) were
published in Advanced Topics in Measurements, pp. 339-400 (2012), by InTech Press.

Part II is composed of three chapters (3-5). 

Chapter “Elementary Equations of Variant Measurement” provides the 
elementary equation of variant measurement to discuss four meta measures under 
permutative and associative properties. Two sets of sample partitions are expressed 
as sum of product of binomial coefficients in the elementary equation. This is a 
systematic approach to handle configuration space under four meta measures. 

Chapter “Triangular Numbers and Their Inherent Properties” uses triangular
numbers to express inherent properties of 1D binary sequences under three
parameters as an elementary equation. A set of interesting properties is explored.
This scheme provides efficient partitions to handle rotational invariant properties of
binary sequences.

Chapter “Symmetric Clusters in Hierarchy with Cryptographic Properties” 
describes symmetric clusters in hierarchy under multiple symmetric operations: 
combination, crossing, variant, and rotation conditions. Rich clusters were observed 
under various conditions. 


Elementary Equations of Variant
Measurement


Jeffrey Zheng 


Abstract Four variant measures are used to represent combinatorial functions 
including binomial coefficients. These variant measures are based on two types of 
m-bit vectors. Type A corresponds to non-periodic boundary conditions, while Type 
B corresponds to periodic boundary conditions. For each type, groups containing 
the four variant measures are formed, which are invariant against permutative and 
associative operations. By mapping two group elements of Type B on coefficients of 
binomial decompositions, patterns similar to Pascal’s triangle are observed. 


Keywords Variant measurement · m variable vector · Multinomial coefficient ·
Permutative and associative operations · Global invariant


1 Introduction 


For any n 0-1 variables, variant logic provides a $(2^n)! \times 2^{2^n}$-dimensional configuration
space [16, 17] to support measurement and analysis [14, 15], which poses a real difficulty
for any practical activity [1, 9-11]. From a measuring analysis viewpoint [6-8, 13],
it is essential to treat static states and their measuring clusters as effective
measures at the core of any 0-1 measuring framework. In this chapter,
starting from the m variables of a 0-1 vector, binomial expressions are applied to support
the four meta measures of variant partitions and the associated multinomial expressions.
Using permutative and associative operations, various variant and invariant
properties are investigated. From a global invariant viewpoint, various combinatorial
clustering properties are systematically explored.


2 Elementary Equation 


Let x be an m-bit vector, $x = x_0 x_1 \ldots x_i \ldots x_{m-1}$, $x_i \in \{0, 1\}$, $0 \le i < m$, $x \in B_2^m$.
Each x is an m-bit state. From a variation viewpoint, two types {A, B} are
distinguished. Let $\{m_\perp, m_+, m_-, m_\top\}$ be four measuring operators.


2.1 Type A Measures 


For a pair of (i, i + 1) elements, $(x_i, x_{i+1})$, $0 \le i < m - 1$, form partitions
(non-periodic boundary conditions).
Four measures can be calculated from the following equations.


$$m_\perp(x) = \sum_{i=0}^{m-2} [(x_i, x_{i+1}) == (0, 0)] \quad (1)$$
$$m_+(x) = \sum_{i=0}^{m-2} [(x_i, x_{i+1}) == (0, 1)] \quad (2)$$
$$m_-(x) = \sum_{i=0}^{m-2} [(x_i, x_{i+1}) == (1, 0)] \quad (3)$$
$$m_\top(x) = \sum_{i=0}^{m-2} [(x_i, x_{i+1}) == (1, 1)] \quad (4)$$
$$m = m_\perp(x) + m_+(x) + m_-(x) + m_\top(x) + 1 \quad (5)$$
From a clustering viewpoint, the last bit of x, x,,-; can be used to distinguish 
relevant combinatorial numbers. While x,,_,; == 1, there are ( 2 ) and for 
m++mr+1 
Xm—| == 0, there are (ae ), possible x vectors, where m, + mz is the number 


of 1 elements in a vector. By adding both binomial coefficients, Pascal’s rule [4] is
obtained.

$$\binom{m}{p} = \binom{m-1}{p} + \binom{m-1}{p-1} \quad (6)$$


$p(x) = m_+(x) + m_\top(x) + 1$, $0 \le p \le m$, $x \in B_2^m$
2.2 Type B Measures


A pair of (i, i + 1) elements is linked as a ring, $(x_i, x_{i+1 \pmod m})$, $0 \le i < m$ (periodic
boundary conditions).


$$m_\perp(x) = \sum_{i=0}^{m-1} [(x_i, x_{i+1 \pmod m}) == (0, 0)] \quad (7)$$
$$m_+(x) = \sum_{i=0}^{m-1} [(x_i, x_{i+1 \pmod m}) == (0, 1)] \quad (8)$$
$$m_-(x) = \sum_{i=0}^{m-1} [(x_i, x_{i+1 \pmod m}) == (1, 0)] \quad (9)$$
$$m_\top(x) = \sum_{i=0}^{m-1} [(x_i, x_{i+1 \pmod m}) == (1, 1)] \quad (10)$$
$$m = m_\perp(x) + m_+(x) + m_-(x) + m_\top(x) \quad (11)$$


Let p be the number of 1 elements, $p(x) = m_+(x) + m_\top(x)$; then the number of
possible x vectors is


$$\binom{m}{p}, \quad 0 \le p \le m. \quad (12)$$
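
The two kinds of measures differ only in whether the last element is paired with the first. The sketch below is an illustration rather than the author's implementation: the function name `measures` and the tuple order (m_perp, m_plus, m_minus, m_top) are assumptions. It counts neighbouring pairs for both boundary conditions and checks the sum rules of Eqs. (5) and (11) over all 6-bit vectors.

```python
from itertools import product

# Pair patterns in the order m_perp (00), m_plus (01), m_minus (10), m_top (11).
PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def measures(x, periodic):
    """Count neighbouring pairs (x[i], x[i+1]); wrap around when periodic (Type B)."""
    m = len(x)
    last = m if periodic else m - 1
    counts = {pair: 0 for pair in PAIRS}
    for i in range(last):
        counts[(x[i], x[(i + 1) % m])] += 1
    return tuple(counts[pair] for pair in PAIRS)


m = 6
for x in product((0, 1), repeat=m):
    assert sum(measures(x, periodic=False)) + 1 == m  # Eq. (5)
    b = measures(x, periodic=True)
    assert sum(b) == m                                # Eq. (11)
    assert b[1] == b[2]            # on a ring, 01 pairs and 10 pairs are equally many
    assert b[1] + b[3] == sum(x)   # p = m_plus + m_top counts the 1 elements

print(measures((1, 1, 0, 0, 1), periodic=True))   # (1, 1, 1, 2)
print(measures((1, 1, 0, 0, 1), periodic=False))  # (1, 1, 1, 1)
```

The equality of the 01 and 10 counts on a ring is what allows a single parameter q to stand for both in the Type B partition.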


3 Partition 


For either Type A or Type B, internal parameters are associated with the four meta measures.
For a brief analysis, Type B is selected as the initial part, and multinomial coefficients are
applied to partition the relevant binomial coefficients. Using m variables, the number p of
1 elements and q branches, the following equations are formulated. Under the partition
condition, the vector x can be ignored.


$$m = m_\perp + m_+ + m_- + m_\top \quad (13)$$
$$p = m_+ + m_\top \quad (14)$$
$$m - p - q = m_\perp \quad (15)$$
$$q = m_+ = m_- \quad (16)$$
$$p - q = m_\top \quad (17)$$
Based on equivalent quantitative numbers, there is a one-to-one correspondence between
the four meta measures and the relevant quantitative measures:


$\{m_\perp, m_+, m_-, m_\top\} \leftrightarrow \{m - p - q, q, q, p - q\}$


from a global restriction to establish an equivalent expressional framework.

From an expressional viewpoint, different partitions are investigated, from a single
binomial coefficient to a set of multinomial coefficients with equivalent properties
among different expressions. Their partitions on various levels are illustrated
in the following sections. A binomial coefficient involves multiple levels
of representation; the first level and the nth level can be connected as


$$\binom{m}{p} = \sum_{k} \prod_{i=1}^{n} \binom{f_i(m, p, k)}{g_i(p, k)} \quad (18)$$


$0 \le p \le m$, $0 \le k \le m$.


The core content of this chapter is to establish a global invariant framework using
n levels of representations by deriving the functions $f_i$ and $g_i$.


4 Variation Space 


Let {a, b, c, d} be a set of four distinct measures. Two operations, permutative and
associative, can be defined on an ordered tuple of the four measures (a, b, c, d).

Permutative operator π: (a, b, c, d) → (π(a), π(b), π(c), π(d)) maps one measure
to another measure.

Associative operator α: {a, b, c, d} → α{a, b, c, d} groups one or more measures
while keeping the initial ordering.

For example, (a, b, c, d) → (b, d, a, c) is a permutative operation and
{a, b, c, d} → {a, b}{c}{d} is an associative operation.

A permutative operation changes the order of the four tuple variables, and an
associative operation changes the sequential relationship among neighbouring elements.
Under normal arithmetic, both operations are conservative under addition, with
global invariant properties. From an algebraic viewpoint, the two operations
are independent.


Lemma 1 For an ordering structure with four measures under two operations: per- 
mutative and associative, there are 192 configurations identified. 


Proof For a vector with 4 members, there are a total of 4! = 24 distinct permutations.
For an ordered set of 4 elements, 8 associated patterns are identified as
follows: {{a,b,c,d}; {a}{b,c,d}; {a,b}{c,d}; {a,b,c}{d}; {a}{b}{c,d}; {a}{b,c}{d};
{a,b}{c}{d}; {a}{b}{c}{d}}. The two operations are independent, so the whole system
contains 24 × 8 = 192 configurations.
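
The count in Lemma 1 can be confirmed by brute force. The following sketch is an illustration only (the helper name `compositions` is an assumption, not taken from the source): it enumerates every ordering of four labels and every way of cutting the ordered tuple into consecutive groups.

```python
from itertools import permutations, product


def compositions(items):
    """All ways to cut an ordered tuple into consecutive non-empty groups."""
    result = []
    for cuts in product((False, True), repeat=len(items) - 1):
        groups, current = [], [items[0]]
        for item, cut in zip(items[1:], cuts):
            if cut:
                groups.append(tuple(current))
                current = [item]
            else:
                current.append(item)
        groups.append(tuple(current))
        result.append(tuple(groups))
    return result


# Every (ordering, grouping) combination, collected as a set to confirm distinctness.
configs = {comp for perm in permutations("abcd") for comp in compositions(perm)}
print(len(list(permutations("abcd"))), len(compositions(tuple("abcd"))), len(configs))
# 24 8 192
```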
5 Invariant Combination


Using both permutative and associative operations, various combinatorial invariants 
can be identified. 


5.1 Type A Invariants 


Five invariant groups can be distinguished. 


Item | Set | Cluster
0  | {} | 1
1  | {a,b,c,d} | 1
2a | {a}{b,c,d}; {b}{a,c,d}; {c}{a,b,d}; {d}{a,b,c} | 4
2b | {a,b}{c,d}; {a,c}{b,d}; {a,d}{b,c} | 3
3  | {a,b}{c}{d}; {a,c}{b}{d}; {a,d}{b}{c}; {b,c}{a}{d}; {b,d}{a}{c}; {c,d}{b}{a} | 6
4  | {a}{b}{c}{d} | 1


Proposition 1 For a measuring structure with four members, Type A has 16 combinatorial
invariants distinguished (0 item: 1 cluster; 1 item: 1 cluster; 2a item:
4 clusters; 2b item: 3 clusters; 3 item: 6 clusters; 4 item: 1 cluster).


Proof Checking the Type A conditions listed, all combinatorial conditions are exhaustively
included.


5.2 Type B Invariants 


For Type B, let b = c; the following simplification can then be performed.


Item | Set | Cluster
0  | {} | 1
1  | {a,b,b,d} | 1
2a | {a}{b,b,d}; {b}{a,b,d}; {b}{a,b,d}; {d}{a,b,b} |
→  | {a}{b,b,d}; {b}{a,b,d}; {d}{a,b,b} | 3
2b | {a,b}{b,d}; {a,b}{b,d}; {a,d}{b,b} |
→  | {a,b}{b,d}; {a,d}{b,b} | 2
3  | {a,b}{b}{d}; {a,b}{b}{d}; {a,d}{b}{b}; {b,b}{a}{d}; {b,d}{a}{b}; {b,d}{b}{a} |
→  | {a,b}{b}{d}; {a,d}{b}{b}; {b,b}{a}{d}; {b,d}{a}{b} | 4
4  | {a}{b}{b}{d} | 1
Proposition 2 For a measuring structure with four members, Type B has 12 combinatorial
invariants distinguished (0 item: 1 cluster; 1 item: 1 cluster; 2a item: 3
clusters; 2b item: 2 clusters; 3 item: 4 clusters; 4 item: 1 cluster).


Proof Checking the Type B conditions listed, all combinatorial conditions are exhaustively
included.
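
Both cluster counts can be reproduced by enumerating the partitions of four labels into unordered blocks and, for Type B, treating the two middle labels as identical. This is a sketch under that reading; the helper names `set_partitions` and `cluster_counts` are illustrative and not from the source. The empty 0 item is added separately.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of a list into unordered non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def cluster_counts(labels):
    """Count distinct partitions by number of blocks; equal labels are identical."""
    distinct = set()
    for partition in set_partitions(list(range(len(labels)))):
        key = tuple(sorted(tuple(sorted(labels[i] for i in block)) for block in partition))
        distinct.add(key)
    return Counter(len(key) for key in distinct)


type_a = cluster_counts("abcd")   # four distinct measures
type_b = cluster_counts("abbd")   # b = c
print(sorted(type_a.items()), sum(type_a.values()) + 1)
# [(1, 1), (2, 7), (3, 6), (4, 1)] 16
print(sorted(type_b.items()), sum(type_b.values()) + 1)
# [(1, 1), (2, 5), (3, 4), (4, 1)] 12
```

The two-block totals 7 and 5 are the sums of the 2a and 2b rows (4 + 3 and 3 + 2).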


6 Combinatorial Expressions of Type B Invariants 


Applying $m_\perp = m - p - q$, $m_+ = m_- = q$, $m_\top = p - q$ to replace {a, b, c, d}, there
are 11 effective formulas:


Item | Set of measures | Cluster
1  | {m} | 1
2a | {m − p − q}{p + q}; {q}{m − q}; {p − q}{m − p + q} | 3
2b | {m − p}{p}; {m − 2q}{2q} | 2
3  | {m − p}{q}{p − q}; {m − 2q}{q}{q}; {2q}{m − p − q}{p − q}; {p}{m − p − q}{q} | 4
4  | {m − p − q}{q}{q}{p − q} | 1


Corollary 1 Type B invariants include 11 nontrivial expressions. 


Proof Only 0 item is a trivial one. 


7 Two Combinatorial Formula and Quantitative 
Distributions 


From a combinatorial viewpoint, the 1-item formula is a binomial coefficient $\binom{m}{p}$,
$0 \le p \le m$, which shows various partition properties with relevant parameters. For
convenient illustration, two expressions are selected: {m − p}{p} and {2q}{m − 2q}, from
the 2 clusters of the 2b item of Type B.


7.1 Case I. {m − p}{p}


In combinatorics, the following identity for binomial coefficients:


$$\binom{m+n}{r} = \sum_{k=0}^{r} \binom{m}{k}\binom{n}{r-k}$$

is Vandermonde’s identity (or Vandermonde’s convolution), for any nonnegative integers
r, m, n. The identity is named after Alexandre-Théophile Vandermonde (1772),
although it was already known in 1303 by the Chinese mathematician Zhu Shijie
(Chu Shi-Chieh) [2, 3, 5, 12].

Applying the Chu-Vandermonde identity to identify {m − p}{p} as $f_1$ and $f_2$ in
Eq. (18), the binomial coefficient at level n = 2 can be written as


$$\binom{m}{p} = \sum_{k=0}^{p} \binom{m-p}{k}\binom{p}{p-k} = \sum_{k=0}^{p} \binom{m-p}{k}\binom{p}{k}, \quad 0 \le p \le m \quad (19)$$


In this way, each binomial coefficient $\binom{m}{p}$ is composed of p + 1 pairs of binomial
coefficient multiplications and the sum over the relevant groups.


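Theorem 1 For all coefficients of Type B, the sum of all coefficients in {m − p}{p},
$0 \le p \le m$, is equal to $2^m$:


$$\sum_{p=0}^{m} \sum_{k=0}^{p} \binom{m-p}{k}\binom{p}{k} = 2^m$$


Proof Since


$$\sum_{p=0}^{m} \binom{m}{p} = 2^m,$$

the result follows by summing Eq. (19) over p.


According to Theorem 1, all parameters of $\{\binom{m-p}{k}\binom{p}{k}\}$ are distributed in an $(m + 1)^2$
2D array.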

For example, when m = 10, all coefficients lie in an 11 × 11 region and the nontrivial values
form a triangular shape with reflection symmetry in the p values.


$m > 0$, $0 \le k, p \le m$: $f(m, p, k) = \binom{m-p}{k}\binom{p}{k}$

f(10, p, k) |   0   1   2   3   4   5   6   7   8   9  10   p
10          |
9           |
8           |
7           |
6           |
5           |                       1
4           |                  15  25  15
3           |              35  80 100  80  35
2           |          28  63  90 100  90  63  28
1           |       9  16  21  24  25  24  21  16   9
0           |   1   1   1   1   1   1   1   1   1   1   1
k           |
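
The Case I decomposition is easy to check numerically. This is a minimal sketch, assuming the tabulated coefficient is f(m, p, k) = C(m − p, k) · C(p, k); the function name `f1` is illustrative.

```python
from math import comb


def f1(m, p, k):
    """Case I coefficient: choose k of the m - p zeros and k of the p ones."""
    return comb(m - p, k) * comb(p, k)


m = 10
# Each column p of the array sums to the binomial coefficient C(m, p).
for p in range(m + 1):
    assert sum(f1(m, p, k) for k in range(p + 1)) == comb(m, p)

# The whole array sums to 2^m, as stated in Theorem 1.
print(sum(f1(m, p, k) for p in range(m + 1) for k in range(p + 1)))  # 1024

# Row k = 2 of the m = 10 array, for p = 2, ..., 8.
print([f1(m, p, 2) for p in range(2, 9)])  # [28, 63, 90, 100, 90, 63, 28]
```

`math.comb(n, k)` returns 0 when k exceeds n, which supplies the empty cells of the triangle without special cases.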


7.2 Case II. {2q}{m — 2q} 


Applying the Chu-Vandermonde identity to identify {2q}{m − 2q} as $f_1$ and $f_2$ in
Eq. (18), the binomial coefficient at level n = 2 can be written as


$$\binom{m}{p} = \sum_{k=0}^{p} \binom{2q}{k}\binom{m-2q}{p-k} \quad (20)$$
$0 \le p \le m$, $0 \le q \le \lfloor m/2 \rfloor$


By using this formula, it is possible to select a special q value in $\{\binom{2q}{k}\binom{m-2q}{p-k}\}$ to
form $\lfloor m/2 \rfloor + 1$ 2D coefficient distributions.


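Theorem 2 For the Type B {2q}{m − 2q} equation, $0 \le p \le m$, $0 \le q \le \lfloor m/2 \rfloor$,
selecting a proper value of q, all coefficients are distributed in $\lfloor m/2 \rfloor + 1$ 2D arrays
and the sum of all coefficients in each 2D array is equal to $2^m$.


Proof Since


$$\forall m > 0,\ 0 \le q \le \lfloor m/2 \rfloor: \quad \binom{m}{p} = \sum_{k=0}^{p} \binom{2q}{k}\binom{m-2q}{p-k}, \qquad \sum_{p=0}^{m} \binom{m}{p} = 2^m,$$


so


$$\sum_{p=0}^{m} \sum_{k=0}^{p} \binom{2q}{k}\binom{m-2q}{p-k} = 2^m.$$


According to Theorem 2, the $\binom{2q}{k}\binom{m-2q}{p-k}$ coefficients are distributed in $\lfloor m/2 \rfloor + 1$
levels of (m + 1) × (m + 1) 2D planes.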
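
The statement of Theorem 2 can be verified level by level. The sketch below assumes the coefficient is f(m, q, p, k) = C(2q, k) · C(m − 2q, p − k), as in Eq. (20); the function name `f2` is illustrative. Swapping the roles of 2q and m − 2q gives the mirrored arrangement and leaves both sums unchanged.

```python
from math import comb


def f2(m, q, p, k):
    """Case II coefficient: k ones among the first 2q positions, p - k among the rest."""
    if k > p:
        return 0
    return comb(2 * q, k) * comb(m - 2 * q, p - k)


m = 10
for q in range(m // 2 + 1):
    # Vandermonde's identity: every column p of level q sums to C(m, p).
    for p in range(m + 1):
        assert sum(f2(m, q, p, k) for k in range(p + 1)) == comb(m, p)
    # Hence every one of the floor(m/2) + 1 levels sums to 2^m.
    level_total = sum(f2(m, q, p, k) for p in range(m + 1) for k in range(p + 1))
    assert level_total == 2 ** m

print(m // 2 + 1)  # 6 levels for m = 10
```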
For example, when m = 10, all coefficients are arranged on 6 levels of 11 × 11 regions
with multiple symmetric properties.


$m > 0$: $\{ f(m, q, p, k) = \binom{2q}{k}\binom{m-2q}{p-k} \}$, $0 \le k, p \le m$, $0 \le q \le \lfloor m/2 \rfloor$


[Three 11 × 11 arrays with p = 0, ..., 10 horizontal and k = 0, ..., 10 vertical; only the entries listed here are recoverable.]

m = 10, q = 0: f(10, 0, p, k) [entries not recoverable]

m = 10, q = 3: f(10, 3, p, k), nonzero rows from top to bottom:
1 6 15 20 15 6 1
4 24 60 80 60 24 4
6 36 90 120 90 36 6
4 24 60 80 60 24 4
1 6 15 20 15 6 1

m = 10, q = 5: f(10, 5, p, k), single nonzero row:
1 10 45 120 210 252 210 120 45 10 1


7.3 Result Analysis 


The two formulas selected from the 2b item of Type B show completely different properties.
In Case I, for a given m, all coefficients are distributed in one triangular area with
reflection symmetry in the p direction.

In contrast, Case II provides multiple levels of 2D distributions, each
corresponding to a selected q value. Of the three listed conditions, q = 0 and q = 5
are linear structures: the first is located on the diagonal positions of the plane and
the second on the horizontal region k = 0, p = {0, 1, ..., 10}. For
0 < q < 5, all distributions appear as parallelograms, and each line shows its own
symmetry. As q varies, the horizontal
projection stays the same, whereas the vertical projection changes from a
binomial distribution at q = 0 to a pulse at $q = \lfloor m/2 \rfloor$. This type of
controllable property could be useful in future advanced applications.


8 Conclusion 


A new approach to decomposing binomial coefficients under permutative and associative
operations is proposed. Using this approach, it is feasible to investigate four
meta measures in global invariant spaces. The resulting set of 192 configurations is
categorized using standard group-theoretic mechanisms. From a statistical viewpoint, Type
A (five levels in 16 clusters) and Type B (five levels in 12 clusters) provide global
identifications of complicated partitions under wider restrictions. Further theoretical
explorations and practical applications are expected in the coming period.




Acknowledgements The author would like to thank Chris Zheng for refined clustering analysis 
on random sequences to open a new way in binomial expressions, Yifeng Zheng and Kaiyu Yang 
for generating binomial coefficients in different conditions and Dr. Dennis Heim for correction of 
the chapter. 


References 


1. J.R. Chen, Combinatorial Mathematics (Harbin Institute of Technology Press, Harbin, 2012) (in Chinese)
2. H.W. Gould, Some generalizations of Vandermonde’s convolution. Am. Math. Mon. 63(2), 84-91 (1956)
3. H.W. Gould, Combinatorial Identities (Morgantown Printing and Binding Company, Morgantown, 1972)
4. M. Hall, Combinatorial Theory, 2nd edn. (Blaisdell, New York, 1986)
5. L.K. Hua, Loo-Keng Hua Selected Papers (Springer, New York, 1982)
6. D.E. Knuth, The Art of Computer Programming, vol. 1, 3rd edn. (Addison-Wesley, Reading, Massachusetts, 1998)
7. D.E. Knuth, The Art of Computer Programming, Volume 4: Combinatorial Algorithms, Part 1 (Addison-Wesley, Boston, 2011)
8. F. Morgan, Geometric Measure Theory, 4th edn. (Elsevier, Amsterdam, 2009)
9. R.P. Stanley, Enumerative Combinatorics, vol. 1, 2nd edn. (Cambridge University Press, Boston, 1997)
10. D. Stanton, R. Stanton, D. White, Constructive Combinatorics (Springer, New York, 1986)
11. G.Z. Tu, Combinatorial Enumeration Methods and Applications (Science Press, 1981) (in Chinese)
12. Vandermonde’s identity. https://en.wikipedia.org/wiki/Vandermonde's_identity
13. L.Z. Xu, M.S. Jiang, Z.Q. Zhu, Combinatorial Mathematics of Computation (Shanghai Science and Technology Press, 1983) (in Chinese)
14. Z.J. Zheng, A. Maeder, The conjugate classification of the kernel form of the hexagonal grid, in Modern Geometric Computing for Visualization (Springer, 1992), pp. 73-89
15. Z.J. Zheng, Conjugate transformation of regular plan lattices for binary images, Ph.D. thesis, Monash University, 1994
16. J.Z.J. Zheng, C.H.H. Zheng, A framework to express variant and invariant functional spaces for binary logic. Front. Electr. Electron. Eng. China 5(2), 163-172 (2010). Higher Educational Press and Springer
17. J.Z.J. Zheng, C.H.H. Zheng, T.L. Kunii, A framework of variant logic construction for cellular automata, in Cellular Automata: Innovative Modeling for Science and Engineering, ed. by A. Salcido (InTech Press, 2011)




Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter’s Creative 
Commons license, unless indicated otherwise in a credit line to the material. If material is not 
included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 


Triangular Numbers and Their Inherent
Properties


Chris Zheng and Jeffrey Zheng 


Abstract A method to classify one-dimensional binary sequences using three 
parameters intrinsic to the sequence itself is introduced. The classification scheme 
creates combinatorial patterns that can be arranged in a two-dimensional triangular 
structure. Projections of this structure contain interesting properties related to the 
Pascal triangle numbers. The arrangement of numbers within the triangular struc- 
ture has been named “triangular numbers”, and the essential parameters, elementary 
equation, and sequencing schemes are discussed as well as visualizations of sam- 
ple distributions, special cases, and search results. We believe this to be a novel 
finding as sequences generated using this method are not contained in the On-Line 
Encyclopedia of Integer Sequences or OEIS. 


Keywords Binary sequence - Classification - Combinatorial patterns - Triangular 
number - Elementary equation - Variant triangle 


1 Introduction 


Additive number theory [7], the study of integer subsets and their behavior under
addition, is a branch of mathematics related to combinatorics. The simplest constructs
within this field are binomial coefficients [6]. The properties of binomial
coefficients have been explored by many over the history of mathematics [8, 9]. One
generalization of the binomial coefficient is the multinomial coefficient [5, 8, 9].
Any multinomial coefficient can be expressed as the product of multiple binomial
coefficients:


$$\binom{n}{k_1, k_2, \ldots, k_m} = \binom{k_1 + k_2}{k_1}\binom{k_1 + k_2 + k_3}{k_1 + k_2} \cdots \binom{k_1 + k_2 + \cdots + k_m}{k_1 + k_2 + \cdots + k_{m-1}} \quad (1)$$


For this type of expansion, the simplest is the trinomial coefficient [10-13]:

$$\binom{r}{k, m - k, r - m} = \binom{r}{m}\binom{m}{k} \quad (2)$$

1.1 Geometric Arrangement of Combinatorial Data 


In discrete geometry [2], as the most basic 2D shape, triangular patterns are found 
in such series as combinatorial triangle A102639, differential triangle A194005 [1], 
additive triangle A035312, and Pascal triangle A007318 [8, 9, 11, 13]. 

This chapter proposes a novel method of classifying binary sequences that
is shown to be combinatorial in nature. By using a simple basis of binary
(0-1) sequences and applying simple classification rules, a triangular structure can
be generated. The set of results has been named “Generative Triangular Numbers”.
The term generative [3] is used to describe the technique of using a simple input
and a repeatedly applied process, creating emergent properties through repetition.
Generative science [4] is a multidisciplinary science that explores the natural world
and its complex behaviors as a generative process. Generative approaches can be used
to simulate and describe behaviors in fractals, cellular automata, and various nonlinear
systems.

The generated patterns are not currently found in the On-Line Encyclopedia of 
Integer Sequences (OEIS) potentially making them an interesting area for further 
research. 


1.2 Previous Work 


The current scheme is derived from the work of Zheng et al. [16, 17], which organizes
1D 0-1 sequences as vectors of a certain length N > 1 using three parameters in variant
measurement construction and, more generally, classifies them on hierarchical discrete
phase spaces.

A trinomial equation is proposed as an elementary equation using three control
parameters {q, p, N} [14, 15] to describe 0-1 vectors of length N as a subgroup,
where N is the length of a vector, p indicates the number of elements with value 1,
and q records the number of changes from either 0 to 1 or 1 to 0 when the vector is taken
in circular form, so as to form a 2D array of nontrivial triangular numbers. This type of
elementary equation can be applied generatively to arrange the relevant triangular numbers
as a geometric distribution, forming a hierarchical 3D array. Based on
this hierarchical 3D array, different integer sequences can be observed from the
generative triangular numbers, and one projection in the p direction is collected
by Vandermonde’s identities to show its correspondence to standard binomial
coefficients. The main results are provided by algorithms, theorems, and corollaries.
Sample cases are illustrated and possible meanings are discussed.


2 Definitions and Sample Cases 
2.1 Definitions 


Definition 1 Let X be a 0-1 vector, $X = x_{N-1} \ldots x_i \ldots x_0$, with N elements as a
state, $x_i \in \{0, 1\}$, $0 \le i < N$.


Definition 2 Let $\Omega(N)$ denote a vector space containing all 0-1 vectors of length N,
$\Omega(N) = \{\forall X \mid 0 \le X < 2^N\}$, as an initial data set.


Definition 3 Let $\binom{n}{k}$ be a binomial coefficient; it satisfies


$$\binom{n}{k} = \begin{cases} 1, & \text{if } n = k; \\ 0, & \text{if } n \ne k,\ k > n \text{ or } k < 0; \\ \dfrac{n!}{k!\,(n-k)!}, & \text{otherwise.} \end{cases} \quad (3)$$


Under this condition, $|\Omega(N)| = 2^N$ forms a vector space with N length, respectively.


Definition 4 For any selected vector $X \in \Omega(N)$, p(X) can be determined by


$$p(X) = \sum_{i=0}^{N-1} x_i, \quad x_i \in \{0, 1\}. \quad (4)$$


Lemma 1 For a vector space $\Omega(N)$, p provides a complete partition into subgroups,
and the number of vectors in each subgroup is a binomial coefficient.


Proof For a given p, $0 \le p \le N$, its combinatorial property makes a total number


of $\binom{N}{p} = \frac{N!}{p!\,(N-p)!}$ vectors identified to partition the vector space $\Omega(N)$.
Definition 5 For a circular vector $X \in \Omega(N)$, q(X) can be determined by


$$q(X) = \sum_{0 \le i < N} [(x_i == 0) \,\&\, (x_j == 1)], \quad x_i, x_j \in \{0, 1\},\ j = (i + 1) \bmod N \quad (5)$$


e.g., N = 10, X = 1110011001, p(X) = 6 (i = {0, 3, 4, 7, 8, 9}); q(X) = 2 (i =
{2, 6}).
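
The two parameters are straightforward to compute. The following sketch assumes that a vector is written as a string with $x_{N-1}$ on the left and $x_0$ on the right, and that q counts the positions i with $x_i = 0$ and $x_{(i+1) \bmod N} = 1$; the function names `p_count` and `q_count` are illustrative.

```python
def p_count(bits):
    """Number of 1 elements in the vector."""
    return bits.count("1")


def q_count(bits):
    """Number of positions i with x_i = 0 and x_((i+1) mod N) = 1 on the ring."""
    x = [int(c) for c in reversed(bits)]  # x[i] is x_i; the string lists x_{N-1} first
    n = len(x)
    return sum(1 for i in range(n) if x[i] == 0 and x[(i + 1) % n] == 1)


X = "1110011001"
print(p_count(X), q_count(X))  # 6 2
```

Because the vector is treated as a ring, rotating the string changes neither value.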


2.2 Sample Cases 


Under this construction, any selected vector can be evaluated by the three parameters.
Applying this set of parameters to create subgroups, interesting inner structures can
be identified. For example, for N = 4, all 16 vectors in the vector space can be
distinguished as six subgroups by pairs of (q, p) values, as shown in Table 1.

Each subgroup is linked to its corresponding vectors in Table 2.

Enumeration numbers of the relevant subgroups are shown in Table 3.


Table 1 Six subgroups for N = 4 vector space in (q, p) partitions 


q\p | 0       1       2       3       4
0   | (0,0)                           (0,4)
1   |         (1,1)   (1,2)   (1,3)
2   |                 (2,2)


Table 2 Six subgroups, vectors, and enumerating numbers 


(q, p) | {X}, N = 4               | No.
(0, 0) | {0000}                   | 1
(0, 4) | {1111}                   | 1
(1, 1) | {0001, 0010, 0100, 1000} | 4
(1, 2) | {0011, 0110, 1100, 1001} | 4
(1, 3) | {0111, 1110, 1101, 1011} | 4
(2, 2) | {0101, 1010}             | 2


Table 3 N = 4, (q, p) subgroup numbers and a projection 
Table 4 Six levels of binomial coefficients and generative triangular numbers


N | {p, N} Binomial Numbers | {q, p, N} Generative Triangular Numbers (one row per q = 0, 1, ...)
1 | 1 1                     | 1 1
2 | 1 2 1                   | 1 1 / 2
3 | 1 3 3 1                 | 1 1 / 3 3
4 | 1 4 6 4 1               | 1 1 / 4 4 4 / 2
5 | 1 5 10 10 5 1           | 1 1 / 5 5 5 5 / 5 5
6 | 1 6 15 20 15 6 1        | 1 1 / 6 6 6 6 6 / 9 12 9 / 2

From Table 3, it is easy to verify that the 16 vectors are the sum of all numbers
from the six subgroups. The subgroup sequence of all numbers is the same as the
N = 4 binomial coefficients. Applying this correspondence for N = 1-6, six rows
of the original binomial coefficients can be created generatively as a three-dimensional
organization, and each row {p, N} sequence corresponds to a (q, p) triangular shape,
as shown in Table 4.

This type of relationship can be expanded by the generative mechanism from the special
cases of N = 1-6 to general conditions for any given N value. The detailed generative
triangular mechanism is described in the next section.


3 Elementary Equations 


Definition 6 Let f(q, p, N) denote a function for generative triangular numbers,
$0 \le p \le N$, $0 \le q \le \lfloor N/2 \rfloor$. For the two initial and end subgroups p = {0, N}, q = 0,
let the two functions of subgroups be f(0, 0, N) = f(0, N, N) = 1.

For other subgroups, each case 0 < p < N, $0 < q \le \lfloor N/2 \rfloor$ is a subgroup under
a given condition. The elementary equation of generative triangular numbers is proposed
using the binomial coefficient expression in Eq. 6.


$$f(q, p, N) = \frac{N}{N - p}\binom{N - p}{q}\binom{p - 1}{q - 1} \quad (6)$$

Table 5 N = 5, f(q, p, 5) subgroup numbers


q\p | 0 | 1 | 2 | 3 | 4 | 5
0   | 1 |   |   |   |   | 1
1   |   | 5 | 5 | 5 | 5 |
2   |   |   | 5 | 5 |   |


Using this elementary equation, the list of values can be verified. For example,
$f(1, 1, 5) = \frac{5}{4}\binom{4}{1}\binom{0}{0} = 5$; $f(2, 3, 5) = \frac{5}{2}\binom{2}{2}\binom{2}{1} = 5$; ...; $f(2, 4, 5) = \frac{5}{1}\binom{1}{2}\binom{3}{1} = 0$.
All {f(q, p, 5)} calculations are listed in Table 5.
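
As an illustrative check, Eq. 6 can be evaluated directly. The following minimal Python sketch is an assumption about how the equation might be coded: the function name `f` is hypothetical, and the boundary cases are handled as read from Definition 6, not taken from any code in the original work. It reproduces the three example values above.

```python
from math import comb


def f(q: int, p: int, N: int) -> int:
    """Generative triangular number f(q, p, N), read from Eq. 6."""
    if q == 0:
        # Definition 6: only the two end subgroups are non-zero for q = 0.
        return 1 if p in (0, N) else 0
    if not 0 < p < N:
        # q > 0 with p = 0 or p = N is a trivial (zero) case.
        return 0
    # math.comb(n, k) returns 0 when k > n, so out-of-range cases vanish.
    # The product is divisible by N - p because the result counts vectors.
    return N * comb(N - p, q) * comb(p - 1, q - 1) // (N - p)


print(f(1, 1, 5), f(2, 3, 5), f(2, 4, 5))  # 5 5 0
```

Integer arithmetic is used instead of floating point so that the values stay exact for large N.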


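Corollary 1 The elementary equation has equivalent identities on a pair of $\{p, N - p\}$:

$$f(q, p, N) = \frac{N}{N-p}\binom{N-p}{q}\binom{p-1}{q-1} = \frac{N}{N-(N-p)}\binom{p}{q}\binom{N-p-1}{q-1} \tag{7}$$


Proof Using the elementary equation, we have

$$\begin{aligned}
f(q, p, N) &= \frac{N}{N-p}\binom{N-p}{q}\binom{p-1}{q-1} \\
&= \frac{N}{N-p}\,\frac{(N-p)!}{(N-p-q)!\,q!}\binom{p-1}{q-1} \\
&= \frac{N}{q}\,\frac{(N-p-1)!}{(N-p-q)!\,(q-1)!}\binom{p-1}{q-1} \\
&= \frac{N}{q}\binom{N-p-1}{q-1}\binom{p-1}{q-1} \\
&= \frac{N}{p}\,\frac{p}{q}\binom{p-1}{q-1}\binom{N-p-1}{q-1} \\
&= \frac{N}{p}\binom{p}{q}\binom{N-p-1}{q-1} \\
&= \frac{N}{N-(N-p)}\binom{p}{q}\binom{N-p-1}{q-1}. \quad \text{(equation 7)}
\end{aligned}$$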


Table 6 arranges the N = 5 values with the p parameter in the vertical direction. For any given N, the triangular numbers can be arranged in the same way (Fig. 1).

Table 6 N = 5, f(q, p, 5) subgroup numbers in vertical direction


p\q | 0 | 1 | 2
0   | 1 |   |
1   |   | 5 |
2   |   | 5 | 5
3   |   | 5 | 5
4   |   | 5 |
5   | 1 |   |
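
[Figure: triangular arrangement of $f(q, p, N)$ with p increasing downward and q increasing to the right. The q = 0 column holds $f(0, 0, N)$ at the top and $f(0, N, N)$ at the bottom; column q holds $f(q, q, N), \ldots, f(q, p, N), \ldots, f(q, N-q, N)$; $0 \le q \le \lfloor N/2 \rfloor$, $0 \le p \le N$.]


Fig. 1 Triangular numbers for a given N > 1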


4 Local Propensities 


It is necessary to investigate different relationships for symmetry properties from the 
elementary equations to distinguish functions for generative triangular numbers. 


4.1 Nontrivial Areas 


Corollary 2 (A pair of symmetric properties) In either $0 < q \le p \le N - q$ or $q = 0$, $p \in \{0, N\}$, a pair of nontrivial trinomial coefficients on triangular numbers satisfies

$$f(q, p, N) = f(q, N - p, N). \tag{8}$$


Proof Using the elementary equation, two cases are required.

Case 1: If q > 0, Eqs. 6 and 7 provide the relevant combinatorial identities, since the intermediate form $\frac{N}{q}\binom{N-p-1}{q-1}\binom{p-1}{q-1}$ in the proof of Corollary 1 is unchanged when p is replaced by N - p. Case 2: If q = 0, we have f(0, 0, N) = f(0, N, N) = 1 by Definition 6.


4.2 Trivial Areas 


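Corollary 3 (Five areas for trivial values) If case 1: $q > 0$, $0 < p < q$; case 2: $N - q < p < N$; case 3: $q = 0$, $0 < p < N$; case 4: $q > 0$, $p = 0$; case 5: $q > 0$, $p = N$, then

$$f(q, p, N) = 0. \tag{9}$$


Proof A binomial coefficient is taken to be zero when its lower index exceeds its upper index or when either index is negative. For cases 1, 2 and 3, we have

$$\begin{aligned}
f(q, p, N) &= \frac{N}{N-p}\binom{N-p}{q}\binom{p-1}{q-1} \\
&= \frac{N}{N-p}\binom{N-p}{q}\left[\binom{p-1}{q-1} = 0\right], \quad 0 < p < q && \text{: Case 1} \\
&= \frac{N}{N-p}\left[\binom{N-p}{q} = 0\right]\binom{p-1}{q-1}, \quad N - q < p < N && \text{: Case 2} \\
&= \frac{N}{N-p}\binom{N-p}{0}\left[\binom{p-1}{-1} = 0\right], \quad q = 0,\ 0 < p < N && \text{: Case 3}
\end{aligned}$$

For cases 4 and 5, we have

$$\begin{aligned}
f(q, p, N) &= \frac{N}{q}\binom{N-p-1}{q-1}\binom{p-1}{q-1} \\
&= \frac{N}{q}\binom{N-1}{q-1}\left[\binom{-1}{q-1} = 0\right], \quad q > 0,\ p = 0 && \text{: Case 4} \\
&= \frac{N}{q}\left[\binom{-1}{q-1} = 0\right]\binom{N-1}{q-1}, \quad q > 0,\ p = N && \text{: Case 5} \\
&= 0.
\end{aligned}$$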
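
The symmetry of Corollary 2 and the zero values of Corollary 3 can also be checked numerically. The sketch below is only an illustration, assuming Eq. 6 is coded as a hypothetical helper `f` in which out-of-range binomial coefficients are zero; `grid` and `is_symmetric` are likewise illustrative names. It prints the N = 5 grid, where the zeros mark the trivial areas.

```python
from math import comb


def f(q, p, N):
    """f(q, p, N) from Eq. 6 with the boundary values of Definition 6."""
    if q == 0:
        return 1 if p in (0, N) else 0
    if not 0 < p < N:
        return 0
    return N * comb(N - p, q) * comb(p - 1, q - 1) // (N - p)


def grid(N):
    """Rows indexed by q = 0..floor(N/2), columns by p = 0..N."""
    return [[f(q, p, N) for p in range(N + 1)] for q in range(N // 2 + 1)]


def is_symmetric(N):
    """Corollary 2: every q row reads the same from p = 0 and from p = N."""
    return all(row == row[::-1] for row in grid(N))


for row in grid(5):
    print(row)
print(all(is_symmetric(N) for N in range(1, 30)))
```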


5 Projection Properties 
5.1 Linear Projection 
In this section, the algebraic properties of linear projection are investigated. 


Definition 7 Let L(p, N) denote a function as a linear projection to collect all possible values for a given p, $0 \le p \le N$.

Table 7 N = 5, f(q, p, 5) subgroup numbers and two projections


p\q | 0 | 1 | 2 | $L(p, 5) = \sum_q f(q, p, 5)$
0   | 1 |   |   | 1
1   |   | 5 |   | 5
2   |   | 5 | 5 | 10
3   |   | 5 | 5 | 10
4   |   | 5 |   | 5
5   | 1 |   |   | 1

For the case of N = 5, two projections and their generative triangular numbers are shown in Table 7, respectively.
The following theorems and corollaries are claimed.


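Theorem 4 If $L(p, N) = \sum_{q=1}^{p} f(q, p, N)$, $0 < p < N$, then the projection function L(p, N) is a binomial coefficient and

$$L(p, N) = \binom{N}{p}. \tag{10}$$


Proof For a fixed p, 0 < p < N, all possible {f(q, p, N)} are collected to form the following equation, where the third line uses the Vandermonde identity:

$$\begin{aligned}
L(p, N) &= \sum_{q=1}^{p} f(q, p, N) \\
&= \frac{N}{N-p}\sum_{q=1}^{p}\binom{N-p}{q}\binom{p-1}{q-1} \\
&= \frac{N}{N-p}\binom{N-1}{p} \\
&= \frac{N}{N-p}\,\frac{(N-1)!}{(N-p-1)!\,p!} \\
&= \binom{N}{p}.
\end{aligned}$$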

For a complete sequence of binomial coefficients, it is necessary to include both initial and end subgroups. Further corollaries can be established.


Corollary 5 For any given N > 0 under the listed condition, the set of projection functions $\{L(p, N)\}$, $0 \le p \le N$, is composed of the same sequence of binomial coefficients

$$L(p, N) = \binom{N}{p}. \tag{11}$$


Proof For 0 < p < N, the values are determined by Theorem 4; for the two end subgroups $p \in \{0, N\}$, $\binom{N}{0} = \binom{N}{N} = 1$ by the defined initial conditions.


Corollary 6 The sum of all possible $\{L(p, N)\}_{p=0}^{N}$ is

$$\sum_{p=0}^{N} L(p, N) = 2^N. \tag{12}$$

Proof Collecting all possible numbers by Corollary 5, we have

$$\sum_{p=0}^{N} L(p, N) = \sum_{p=0}^{N}\binom{N}{p} = (1 + 1)^N = 2^N.$$


Corollary 7 For $0 \le p \le N$, a pair of functions has an equivalent formula

$$L(p, N) = L(N - p, N). \tag{13}$$


Proof By Corollary 2, both sides are equal.


Theorem 8 For any N > 0, the sum of all possible functions on $\{f(q, p, N)\}_{\forall p, \forall q}$ or $\{L(p, N)\}_{p=0}^{N}$ is equal to $2^N$:

$$\sum_{\forall p}\sum_{\forall q} f(q, p, N) = \sum_{p=0}^{N} L(p, N) = 2^N. \tag{14}$$


Proof By Corollary 6, the two sums are equal.
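
The projection results (Eqs. 10-14) lend themselves to a quick numerical check. The sketch below is a hedged illustration, not the author's implementation: `f` is a hypothetical coding of Eq. 6, and `L` simply sums it over q for a fixed p.

```python
from math import comb


def f(q, p, N):
    """f(q, p, N) from Eq. 6 with the boundary values of Definition 6."""
    if q == 0:
        return 1 if p in (0, N) else 0
    if not 0 < p < N:
        return 0
    return N * comb(N - p, q) * comb(p - 1, q - 1) // (N - p)


def L(p, N):
    """Linear projection L(p, N): sum of f(q, p, N) over all q."""
    return sum(f(q, p, N) for q in range(N // 2 + 1))


for N in (4, 5, 6):
    row = [L(p, N) for p in range(N + 1)]
    # Each row should match the binomial coefficients and sum to 2**N.
    print(N, row, sum(row), row == [comb(N, p) for p in range(N + 1)])
```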

5.2 Triangular Sequence


Definition 8 For a given N > 1, let T(N) denote a 2D structure with all nontrivial triangular numbers.

$$T(N) = \{f(q, p, N) \mid f(q, p, N) > 0,\ 0 \le q \le \lfloor N/2 \rfloor,\ 0 \le p \le N\} \tag{15}$$


Corollary 9 For a given N, if |T(N)| is the total number of distinguishable elements for nontrivial triangular numbers, then |T(N)| satisfies the following equation:

$$|T(N)| = \begin{cases} N^2/4 + 2, & N \equiv 0 \pmod 2 \\ (N^2 - 1)/4 + 2, & N \equiv 1 \pmod 2. \end{cases} \tag{16}$$


Proof By Corollary 2 for a given N, the triangular shape of nontrivial members is composed of two parts: a triangular area and two q = 0 points. The triangular area has length $N - 1$ and height $\lfloor N/2 \rfloor$. If $N \equiv 0 \pmod 2$, the triangular area is a regular triangle containing $N^2/4$ elements, so the total number in the generative triangular shape is $N^2/4 + 2$. For an odd N, the triangular area has an additional $\lfloor N/2 \rfloor$ members beside a regular triangle with $\lfloor N/2 \rfloor^2$ elements, so the total number of elements is $\lfloor N/2 \rfloor^2 + \lfloor N/2 \rfloor + 2 = (N^2 - 1)/4 + 2$.


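Definition 9 For a given N > 1, let TS(N) denote an integer sequence with |T(N)| elements for all nontrivial triangular numbers in T(N)

$$\begin{aligned}
TS(N) := [&f(0, 0, N), f(0, N, N), \ldots, \\
&\ldots, f(q, q, N), \ldots, f(q, p, N), \ldots, f(q, N - q, N), \ldots, \\
&\ldots, f(\lfloor N/2 \rfloor, \lfloor N/2 \rfloor, N), f(\lfloor N/2 \rfloor, \lceil N/2 \rceil, N)], \\
&1 \le q \le \lfloor N/2 \rfloor,\ q \le p \le N - q.
\end{aligned} \tag{17}$$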


5.3 Linear Sequence 


Definition 10 For a given N > 1, let L(N) denote a 1D structure with relevant linear numbers.

$$L(N) = \{L(p, N) \mid 0 \le p \le N\} \tag{18}$$


Corollary 10 For a given N, if |L(N)| is the total number of distinguishable elements for linear numbers, then |L(N)| satisfies Eq. 19.

$$|L(N)| = N + 1 \tag{19}$$

Table 8 {T(4), T(5), T(6)}, {L(4), L(5), L(6)} subgroup numbers in three levels


N | p | T(N): q = 0 | q = 1 | q = 2 | q = 3 | L(N)
4 | 0 | 1           |       |       |       | 1
4 | 1 |             | 4     |       |       | 4
4 | 2 |             | 4     | 2     |       | 6
4 | 3 |             | 4     |       |       | 4
4 | 4 | 1           |       |       |       | 1
5 | 0 | 1           |       |       |       | 1
5 | 1 |             | 5     |       |       | 5
5 | 2 |             | 5     | 5     |       | 10
5 | 3 |             | 5     | 5     |       | 10
5 | 4 |             | 5     |       |       | 5
5 | 5 | 1           |       |       |       | 1
6 | 0 | 1           |       |       |       | 1
6 | 1 |             | 6     |       |       | 6
6 | 2 |             | 6     | 9     |       | 15
6 | 3 |             | 6     | 12    | 2     | 20
6 | 4 |             | 6     | 9     |       | 15
6 | 5 |             | 6     |       |       | 6
6 | 6 | 1           |       |       |       | 1


Definition 11 For a given N > 1, let LS(N) denote an integer sequence with |L(N)| elements for all linear numbers in L(N) (Table 8)

$$LS(N) := [L(0, N), \ldots, L(p, N), \ldots, L(N, N)], \quad 0 \le p \le N. \tag{20}$$


From the listed six groups of {T(4), T(5), T(6)} and {L(4), L(5), L(6)} structures, two integer sequences are arranged as follows:


TS(4), TS(5), TS(6) := [1, 1, 4, 4, 4, 2, 1, 1, 5, 5, 5, 5, 5, 5, 1, 1, 6, 6, 6, 6, 6, 9, 12, 9, 2];
LS(4), LS(5), LS(6) := [1, 4, 6, 4, 1, 1, 5, 10, 10, 5, 1, 1, 6, 15, 20, 15, 6, 1].
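
The two sequences can be regenerated from the definitions. The following sketch is an illustration under assumptions: `f` is a hypothetical coding of Eq. 6, `TS` follows the ordering read from Eq. 17 (the two q = 0 ends first, then each q row for p = q, ..., N - q), and `LS` follows Eq. 20.

```python
from math import comb


def f(q, p, N):
    """f(q, p, N) from Eq. 6 with the boundary values of Definition 6."""
    if q == 0:
        return 1 if p in (0, N) else 0
    if not 0 < p < N:
        return 0
    return N * comb(N - p, q) * comb(p - 1, q - 1) // (N - p)


def TS(N):
    """Triangular sequence: q = 0 ends, then rows q = 1..floor(N/2)."""
    seq = [f(0, 0, N), f(0, N, N)]
    for q in range(1, N // 2 + 1):
        seq.extend(f(q, p, N) for p in range(q, N - q + 1))
    return seq


def LS(N):
    """Linear sequence: projection of f over q for p = 0..N."""
    return [sum(f(q, p, N) for q in range(N // 2 + 1)) for p in range(N + 1)]


print(TS(4) + TS(5) + TS(6))
print(LS(4) + LS(5) + LS(6))
```

Both sequences for a given N sum to the same value, which is what makes them two different partitions of one total.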


6 Sample Cases 


Two sample cases are selected for N = {16, 17} to show their triangular numbers and generative structures in Table 9. In relation to the relevant integer sequences, both {L(16), L(17)} and {T(16), T(17)} are shown in Table 9. The two integer sequences are significantly different. The triangular number sequence in this case, with a total length of 140 integers, is four times as long as the linear number sequence, with a total length of 35 integers. The two integer sequences represent different partition results of the same number $196608 = 2^{16} + 2^{17}$ for generative binomial and trinomial coefficients, respectively.

Table 9 Triangular number arrays for N = {16, 17} cases


N  | L(N)  | T(N)
16 | 1     | 1
   | 16    | 16
   | 120   | 16, 104
   | 560   | 16, 192, 352
   | 1820  | 16, 264, 880, 660
   | 4368  | 16, 320, 1440, 1920, 672
   | 8008  | 16, 360, 1920, 3360, 2016, 336
   | 11440 | 16, 384, 2240, 4480, 3360, 896, 64
   | 12870 | 16, 392, 2352, 4900, 3920, 1176, 112, 2
   | 11440 | 16, 384, 2240, 4480, 3360, 896, 64
   | 8008  | 16, 360, 1920, 3360, 2016, 336
   | 4368  | 16, 320, 1440, 1920, 672
   | 1820  | 16, 264, 880, 660
   | 560   | 16, 192, 352
   | 120   | 16, 104
   | 16    | 16
   | 1     | 1
N = 16: $|L(16)| = 17$, $|T(16)| = 66$, $\sum_{\forall q, \forall p} f(q, p, 16) = 65536 = 2^{16}$
17 | 1     | 1
   | 17    | 17
   | 136   | 17, 119
   | 680   | 17, 221, 442
   | 2380  | 17, 306, 1122, 935
   | 6188  | 17, 374, 1870, 2805, 1122
   | 12376 | 17, 425, 2550, 5100, 3570, 714
   | 19448 | 17, 459, 3060, 7140, 6426, 2142, 204
   | 24310 | 17, 476, 3332, 8330, 8330, 3332, 476, 17
   | 24310 | 17, 476, 3332, 8330, 8330, 3332, 476, 17
   | 19448 | 17, 459, 3060, 7140, 6426, 2142, 204
   | 12376 | 17, 425, 2550, 5100, 3570, 714
   | 6188  | 17, 374, 1870, 2805, 1122
   | 2380  | 17, 306, 1122, 935
   | 680   | 17, 221, 442
   | 136   | 17, 119
   | 17    | 17
   | 1     | 1
N = 17: $|L(17)| = 18$, $|T(17)| = 74$, $\sum_{\forall q, \forall p} f(q, p, 17) = 131072 = 2^{17}$
Total: $|LS(16), LS(17)| = |L(16)| + |L(17)| = 35$; $|TS(16), TS(17)| = |T(16)| + |T(17)| = 140$; $\sum_{\forall q, \forall p}\sum_{n=16}^{17} f(q, p, n) = 196608 = 2^{16} + 2^{17}$


7 Conclusion 


The proposed elementary equation of trinomial coefficients has excellent symmetric properties on a 2D grid, similar to those of binomial coefficients on a 1D line, and a projecting operation turns the 2D T(N) array into the 1D linear L(N) array. Two types of integer sequences, TS(N) and LS(N), can be generated. As the simplest expansion of multinomial coefficients, discrete 2D geometry could provide a solid combinatorial foundation to support multinomial explorations.

From a combinatorial geometry viewpoint, triangular numbers provide a key construction linking trinomial and binomial representations in mathematical foundations. Trinomial integer sequences, as representatives, need to be explored in depth by the modern combinatorial and discrete mathematics communities. Further exploration is expected in detailed analysis, systematic construction, and practical applications.

Symmetric Clusters in Hierarchy
with Cryptographic Properties


Jeffrey Zheng


Abstract Symmetric Boolean functions play a key role in stream ciphers. Symmetric constructions provide core components in cryptographic applications. In this chapter, four meta symmetric clustering schemes (combination, crossing, variant and rotation) are organized in a hierarchy for n variables of 0-1 vectors in measuring phase spaces. Local counting properties in a cluster and global counting properties in a given level are formulated. From selected symmetric clusters, a number of symmetric Boolean functions are formulated. Counting properties on symmetric clusters, vectors in selected clusters and special symmetric Boolean functions are listed. Four sets of symmetric Boolean functions are compared. Properties of symmetric clusters and Boolean functions are discussed. The main results are expressed in theorems and tables. Among the four meta schemes, the variant scheme presents novel properties, with approximately $O(n^2/4)$ clusters on a 2D phase space, different from the other schemes: combinatorial $O(n)$, crossing $O(n/2)$ and rotation $O(2^n/n)$ on 1D measuring phase spaces, respectively. The variant pseudorandom number generator takes an approach similar to the RC4 and HC128 stream ciphers, using word-oriented 0-1 vectors. Further research and exploration of relevant optimal configurations are required.


Keywords Symmetric construction · Meta symmetric cluster · Hierarchy · Boolean function · Four meta schemes · Phase space

1 Introduction


Symmetric Boolean functions [5] have been widely used as components of different cryptosystems [25] (e.g. in stream ciphers, block ciphers or hash functions). In combinatorial mathematics [10], a symmetric Boolean function is a Boolean function whose value does not depend on the permutation of its input bits [4], i.e. it depends only on the number of ones in the input on n variables of 0-1 vectors [21]. A total of $2^n$ vectors compose a vector space or a phase space for the construction [19]. A specific symmetric Boolean function must have invariant properties under a special group of permutations [18]. For example, rotation symmetric Boolean functions are invariant under the circular translation of indices. In addition to rotation symmetric properties, multiple invariants (combination, crossing, reflection, translation) may be composed from various symmetric subgroups of permutations [10, 22]. Various combinatorial counting schemes are explored [34-36].


1.1 Symmetric Functions—Combinatorial Invariant 


From a combinatorial viewpoint, symmetric Boolean functions are a combinatorial invariant linked to the number of one elements p, $0 \le p \le n$, in a vector [35]. In combinatorics, this type of function has been linked to binomial coefficients, and normally there are n + 1 partitions that use this parameter to divide a measuring phase space into various clusters [30]. Symmetric Boolean functions are characterized [36] by the fact that their outputs depend only on the p numbers of their inputs. The usefulness of symmetric functions, which possess good cryptographic properties, has been widely explored in a cryptographic context [6, 7].


1.2 Crossing Number - Topological Invariant 


A zero-crossing [23] describes a point where the sign of a mathematical function changes (e.g. from positive to negative), represented by a crossing of the axis (zero value) in the graph of the function. It is a commonly used term in electronics, mathematics, sound and image processing.

From a measuring viewpoint, a 0-1 vector with n bits can be expressed as a circular ring that has a fixed crossing number q, $0 \le q \le \lfloor n/2 \rfloor$, which counts the derivative changes of either 0-1 or 1-0, respectively. This type of derivative invariant has been widely used in cryptanalysis for many years. In NIST random data testing packages [1], binary derivative [3] and Runs tests [2] play an important role in measuring the randomness of a binary sequence formed by a pseudorandom number generator for use in cipher systems. From an analytic viewpoint, this parameter is a topological invariant, different from a combinatorial invariant, and provides another type of partition capacity to organize a set of clusters in a measuring phase space.


1.3 Rotation Symmetric Functions - Geometric Invariant 


In combinatorial mathematics, rotation symmetric properties have been widely explored as a geometric invariant since the early stages of abstract group theory and symmetric group constructions [10, 22]. Filiol and Fontaine [12] initially explored balanced Boolean functions with good correlation immunity. Pieprzyk and Qu [26] used Rotation Symmetric Boolean Functions (RSBF) in crypto-applications as components in the rounds of a hashing algorithm.

Extensive R&D activities on RSBF have continued over the last decades, and a list of advanced topics has been explored, such as degree and non-linearity [6], optimal algebraic immunity [7], bent and semi-bent functions [8, 33], non-linearity of resilient, non-linear Boolean functions [20, 28], balanced Boolean functions [12, 16], non-linear balanced Boolean functions [31], weights and non-linearity [11], immune combining functions [32], count and cryptographic properties [13, 29], etc.


1.4 Trinomial Coefficients 


It is a natural approach [10, 18, 19] to apply binomial coefficients to partition a measuring phase space on 0-1 vector sets. However, when the number of parameters increases beyond three, a generalization [34-36] using multinomial coefficients may not provide a general solution for further refined partitions if the processed phase space is composed of 0-1 vectors. It is convenient to use a trinomial expression to show this fact.

Let $n = n_1 + n_2 + n_3$, $0 \le n_i$,

$$\binom{n}{n_1, n_2, n_3} = \frac{n!}{n_1!\,n_2!\,n_3!},$$

collecting all possible trinomial coefficients, we have

$$\sum_{\forall n_1, n_2, n_3}\binom{n}{n_1, n_2, n_3} = 3^n. \tag{1}$$

From Eq. 1, it is interesting to notice that trinomial coefficients provide further segments to partition three-valued 0-2 vectors. For this reason, extensions using multinomial coefficients may not be directly relevant to binary-valued 0-1 vector sets. Refined identity equations of combinatorics are required [14, 15].
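
By the multinomial theorem, the trinomial coefficients for a fixed n sum to $3^n$, the number of three-valued vectors of length n, rather than the $2^n$ binary vectors. A small Python sketch, with the hypothetical helper names `trinomial` and `trinomial_sum`, makes the mismatch concrete.

```python
from math import factorial


def trinomial(n, n1, n2, n3):
    """Trinomial coefficient n! / (n1! n2! n3!) for n1 + n2 + n3 = n."""
    return factorial(n) // (factorial(n1) * factorial(n2) * factorial(n3))


def trinomial_sum(n):
    """Sum over all non-negative (n1, n2, n3) with n1 + n2 + n3 = n."""
    return sum(
        trinomial(n, n1, n2, n - n1 - n2)
        for n1 in range(n + 1)
        for n2 in range(n - n1 + 1)
    )


for n in (1, 2, 3, 4):
    # The total counts three-valued vectors, not binary ones.
    print(n, trinomial_sum(n), 3 ** n, 2 ** n)
```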

1.5 Variant Symmetric Schemes - Variant Invariants


Various schemes that use multiple invariants to partition special phase spaces have been explored in binary image analysis and processing for many years. In the 1990s, Zheng [39, 40] proposed conjugate classifications that apply seven invariants in a hierarchy to partition the kernels of four regular plane lattices on n = {4, 5, 7, 9} cases for 2D binary images. For n-tuple 0-1 vectors, variant logic frameworks [41, 42] were proposed in the 2010s, and various applications have been explored, such as a 3D visual method [37], a variant Pseudorandom Number Generator (PRNG) [38, 43], computational simulation of quantum interactions [44-47] and non-coding DNA analysis [48-50].


1.6 Organization of the Chapter 


In this chapter, an algebraic equation of variant trinomial will be proposed as a kernel 
structure to arrange a hierarchical phase space. This extension provides a general 
framework of multiple symmetric operations to support three numeric numbers: 
combinatorial, crossing and variant in a hierarchy. Three meta clusters of measuring 
phase spaces are identified by the three invariants: {n, p, q} and their combinations. 
Refined levels can be compared with the rotation symmetric scheme under n = 
{1, 2, 3,4, 5} conditions. Similarities and differences among the four schemes are 
explored. 

In Sect. 2, symbols and local counting properties of symmetric clusters in mea- 
suring spaces are defined, algebraic equations are formulated and two important 
projections are discussed. In Sect. 3, variant symmetric clusters and their elemen- 
tary equation are proposed. In Sect. 4, four number sets of symmetric clusters are 
explored from a global viewpoint. In Sect. 5, symmetric Boolean functions of selected 
clusters are constructed and both algebraic and approximate numeric properties are 
discussed. In Sect. 6, cryptographic properties of symmetric Boolean functions in 
a hierarchy are discussed and special properties on the variant scheme are stressed. 
Section 7 is the conclusion of the chapter. Main results of the chapter are expressed 
in a list of theorems and corollaries in Sects. 2-5, respectively. 


2 Symmetric Clusters in Measuring Phase Spaces 


In this section, basic symbols, primary definitions and algebraic formulas are defined 
for different clusters in their measuring phase spaces. 

2.1 Basic Symbols


Main symbols in this chapter are listed in Table 1. 


2.2 Primary Definitions 


Definition 1 (x an n-tuple vector on 0-1 variables) Let x be a 0-1 vector of length n.

$$x = (x_{n-1}, \ldots, x_i, \ldots, x_0),\ 0 \le i < n,\ x_i \in \{0, 1\} = B_2,\ x \in B_2^n, \tag{2}$$

e.g. x = 110010, n = 6.


Table 1 Basic symbols 


Symbol | Notes
n | Number of 0-1 variables, $1 \le n$
x | 0-1 vector $x = (x_{n-1}, \ldots, x_i, \ldots, x_0)$, $x_i \in \{0, 1\} = B_2$, $0 \le i < n$
I | I(x) index for a vector x
$\Omega(n)$ | Phase space of vector set {x}, $\Omega(n) = \{\forall x \mid 0 \le I < 2^n\}$
$f_\Omega(n)$ | Number of vectors in $\Omega(n)$
R | R(x; r) rotation operator
F | F(x) reflection operator
p | p(x) number of 1 elements in x, $0 \le p \le n$
q | q(x) number of cyclic crossings either 0-1 or 1-0 in x
L(p, n) | Combinatorial cluster of vectors in $\Omega(n)$, $L(p, n) \subset \Omega(n)$
E(q, n) | Crossing cluster of vectors in $\Omega(n)$, $E(q, n) \subset \Omega(n)$
V(q, p, n) | Variant cluster of vectors in $\Omega(n)$, $V(q, p, n) \subset \Omega(n)$
G(m, n) | m-th rotation symmetric cluster of vectors in $\Omega(n)$, $G(m, n) \subset \Omega(n)$
$f_E(q, n)$ | Crossing number of vectors in a cluster E(q, n)
$f_L(p, n)$ | Combinatorial number of vectors in a cluster L(p, n)
$f(q, p, n)$ | $f_V(q, p, n)$ variant number of vectors in a cluster V(q, p, n)
$f_G(m, n)$ | Rotation number of vectors in the m-th cluster G(m, n)
O(N) | Approximate number of N
$C_X(n)$ | Approximate number of clusters in a set of {X(.)}, $X \in \{E, L, V, G\}$
$f_X(n)$ | Approximate number of clusters in a set of {X(.)}, $X \in \{E, L, V, G\}$
$SF_X(n)$ | Number of Symmetric Boolean Functions (SBF) in {X(.)}, $X \in \{E, L, V, G\}$
$SF_{XB}(n)$ | Number of balanced $SBF_X$ in {X(.)}, $X \in \{L, V, G\}$, $n \equiv 0 \bmod 2$
$SF_{EB}(n)$ | Number of balanced SBF in $\exists q$, {E(q, n)}, $n \equiv 0 \bmod 4$

Definition 2 (I index for a vector x) For a vector x, let I or I(x) be an index:

$$I = I(x) = \sum_{i=0}^{n-1} x_i 2^i, \tag{3}$$

e.g. x = 110010, $I(x) = 2^5 + 2^4 + 2 = 32 + 16 + 2 = 50$.
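
As a small illustration of Eq. 3, the index can be computed from a vector written as a bit string. This sketch assumes the string is ordered $x_{n-1} \ldots x_0$ from left to right, as in the example above; the function name `index` is hypothetical.

```python
def index(x: str) -> int:
    """Index I(x) of a 0-1 vector written as x_{n-1} ... x_0 (Eq. 3)."""
    n = len(x)
    # x[n - 1 - i] holds the bit x_i, which carries the weight 2**i.
    return sum(int(x[n - 1 - i]) << i for i in range(n))


print(index("110010"))  # 50
```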


Definition 3 ($\Omega(n)$ a full set of n-tuple 0-1 vectors) Let $\Omega(n)$ be a vector space or a phase space of all n-tuple 0-1 vectors,

$$\Omega(n) = \{\forall x \mid 0 \le I < 2^n,\ x \in B_2^n\} \text{ and } \Omega(n) = B_2^n. \tag{4}$$

Definition 4 Let $f_\Omega(n)$ denote the number of vectors in $\Omega(n)$.

Lemma 1 $f_\Omega(n)$ is equal to $2^n$.


Proof For a vector $x \in B_2^n$ from 0...0 to 1...1, its index I covers the full region $0 \le I < 2^n$, so $\Omega(n)$ contains $2^n$ distinct vectors and $f_\Omega(n) = 2^n$.


Definition 5 (Measuring Phase Space) If a phase space can be organized by various 
invariants, then it is a measuring phase space and its dimension is determined by a 
number of active invariants. 


Corollary 1 For any n > 0, $\Omega(n)$ is a measuring phase space in zero dimension.

Proof For any n > 0, $\Omega(n)$ is composed of one cluster of vectors as a single point.


Definition 6 (R rotation operator) Let R(x; r) be a rotation operator that rotates a vector x by r positions, $-n < r < n$:

$$\begin{aligned}
R(x; r) &= R(x_{n-1}, \ldots, x_i, \ldots, x_0; r) \\
&= (x_{(n-1+r) \bmod n}, \ldots, x_{(i+r) \bmod n}, \ldots, x_{(0+r) \bmod n}),
\end{aligned} \tag{5}$$

e.g. x = 110010, $\{R(x; r)\}_{r=0}^{5}$ = {110010, 100101, 001011, 010110, 101100, 011001} with six distinct vectors.


Lemma 2 (Maximal cyclic structure) Starting from any vector x, at most n distinct vectors can be distinguished under the rotation operator.


Proof From any x, a set $\{R(x; r)\}_{r=0}^{n-1}$ of n vectors can be generated. If the listed sequence of n vectors contains more than one cycle, then the number of distinct vectors will be less than n.


For example, x = 110110, $\{R(x; r)\}_{r=0}^{5}$ = {110110, 101101, 011011, 110110, 101101, 011011} with only a set of three distinct vectors: {110110, 101101, 011011}.
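
The rotation operator and the cyclic structure of Lemma 2 can be sketched in a few lines. This is an illustration only: `rotate` and `orbit` are hypothetical names, and the bit string is assumed to be written $x_{n-1} \ldots x_0$, with position i receiving $x_{(i+r) \bmod n}$ as in Eq. 5.

```python
def rotate(x: str, r: int) -> str:
    """Cyclic rotation R(x; r): position i receives x_{(i + r) mod n}."""
    n = len(x)
    bits = x[::-1]  # bits[i] is x_i
    rotated = [bits[(i + r) % n] for i in range(n)]
    return "".join(reversed(rotated))  # back to x_{n-1} ... x_0 order


def orbit(x: str) -> set:
    """All distinct vectors reachable from x by rotation."""
    return {rotate(x, r) for r in range(len(x))}


print(sorted(orbit("110010")))  # six distinct vectors
print(sorted(orbit("110110")))  # only three distinct vectors
```

Collecting the rotations in a set removes repeats, so the orbit size directly shows when a vector contains more than one cycle.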

Definition 7 (F reflection operator) Let F(x) be a reflection operator,

$$F(x) = F(x_{n-1}, \ldots, x_i, \ldots, x_0) = (x_0, \ldots, x_{n-1-i}, \ldots, x_{n-1}),\ 0 \le i < n. \tag{6}$$


Lemma 3 (A pair of reflections) For any vector x, only two results are distinguished under the F(x) operation: (1) $F(x) = x$; (2) $F(x) \ne x$.


Proof (1) If $F(x) = x$, then the values of the vector x are distributed in a centrally symmetric form; (2) if $F(x) \ne x$, then the vector x does not have a symmetric distribution.


For example, x = 110010, F(x) = 010011; y = 110011, F(y) = 110011.

Definition 8 (p number of one elements) Let p or p(x) be the number of one elements in x,

$$p = p(x) = \sum_{i=0}^{n-1} x_i,\ 0 \le p \le n. \tag{7}$$


For example, x = 110010, p(x) = 3; y = 110011, p(y) = 4.


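Definition 9 (q number of cyclic crossings) Let q or q(x) be the number of cyclic crossings, either 0-1 or 1-0, in a vector x,

$$\begin{aligned}
q = q(x) &= \sum_{0 \le i < n} (x_i = 0)\,\&\,(x_{i+1} = 1); \quad x_i, x_{i+1} \in B_2,\ (i + 1) \bmod n; \\
&= \sum_{0 \le i < n} (x_i = 0)\,\&\,(x_{i-1} = 1); \quad x_i, x_{i-1} \in B_2,\ (i - 1) \bmod n; \\
&0 \le q \le \lfloor n/2 \rfloor.
\end{aligned} \tag{8}$$


For example, x = 110010, q(x) = 2; y = 110011, q(y) = 1.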
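
The two measures p and q are simple to compute for a bit string. The sketch below is a minimal illustration with hypothetical function names `ones` and `crossings`; it counts cyclic 1-0 adjacencies, which for a ring equals the number of 0-1 adjacencies.

```python
def ones(x: str) -> int:
    """p(x): number of 1 elements in the vector (Eq. 7)."""
    return x.count("1")


def crossings(x: str) -> int:
    """q(x): number of cyclic 1-0 crossings in the vector (Eq. 8)."""
    # x[i - 1] wraps around to the last character when i = 0,
    # which closes the vector into a ring.
    return sum(1 for i in range(len(x)) if x[i - 1] == "1" and x[i] == "0")


for x in ("110010", "110011"):
    print(x, ones(x), crossings(x))
```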


2.3 Counting Properties on Rotation Clusters 


Definition 10 (G(m, n) m-th rotation symmetric cluster) Let G(m, n) be the m-th rotation symmetric cluster of vectors in $\Omega(n)$, $G(m, n) = \Omega(n|m) \subset \Omega(n)$, and let the total number of rotation symmetric clusters be $C_G(n)$, $1 \le m \le C_G(n)$,

$$\Omega(n) = \bigcup_{m=1}^{C_G(n)} \Omega(n|m) = \bigcup_{m=1}^{C_G(n)} G(m, n). \tag{9}$$

Corollary 2 A set of $\{G(m, n)\}_{m=1}^{C_G(n)}$ is composed of a measuring phase space in one dimension.


Proof Using the parameter m, $\{G(m, n)\}_{m=1}^{C_G(n)}$ can be listed in a linear order.


Lemma 4 By Burnside's lemma, $\varphi$ being Euler's phi-function,

$$C_G(n) = \frac{1}{n}\sum_{k \mid n} \varphi(k)\, 2^{n/k}. \tag{10}$$

Proof A brief proof of this lemma can be found in [29].
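
The count in Eq. 10 can be compared with a direct enumeration of rotation clusters. The following Python sketch is an assumption-laden illustration: `phi`, `rotation_cluster_count` and `brute_force_count` are hypothetical names, and the brute-force routine simply groups all $2^n$ bit strings by rotation.

```python
from math import gcd


def phi(k: int) -> int:
    """Euler's phi-function by direct counting (adequate for small k)."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def rotation_cluster_count(n: int) -> int:
    """C_G(n) from Eq. 10: (1/n) * sum over divisors k of phi(k) * 2**(n/k)."""
    total = sum(phi(k) * 2 ** (n // k) for k in range(1, n + 1) if n % k == 0)
    return total // n


def brute_force_count(n: int) -> int:
    """Count rotation clusters by enumerating all n-bit vectors."""
    seen, clusters = set(), 0
    for v in range(2 ** n):
        x = format(v, f"0{n}b")
        if x in seen:
            continue
        clusters += 1
        # Mark every rotation of x as belonging to this cluster.
        seen.update(x[r:] + x[:r] for r in range(n))
    return clusters


for n in range(1, 9):
    print(n, rotation_cluster_count(n), brute_force_count(n))
```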
Definition 11 Let $f_G(m, n)$ denote the number of vectors in the m-th cluster G(m, n).

Corollary 3 For any $f_G(m, n)$, $1 \le f_G(m, n) \le n$.


Proof By Lemma 2, each $f_G(m, n) \le n$ in general; for the two special vectors in {0...0, 1...1}, we have $f_G(m, n) = 1$.


Corollary 4 Collecting all possible rotation clusters, the total number of vectors is equal to $f_\Omega(n)$

$$\sum_{m=1}^{C_G(n)} f_G(m, n) = 2^n = f_\Omega(n). \tag{11}$$


Proof From Lemma 4 and Corollary 3, it contains a full set of $2^n$ vectors in $\Omega(n)$.


Lemma 5 For a given n, $C_G(n)$ has an approximate number,

$$C_G(n) \approx O\!\left(\frac{2^n}{n}\right). \tag{12}$$


Proof Using Corollaries 3 and 4, each distinct cluster contains at most n vectors; it is natural to have such an approximate number in enumeration.


It is convenient to list defined rotation parameters in Table 2 for n = 4 condition. 


2.4 Counting Properties on Measuring Phase Spaces 


For any vector $x \in \Omega(n)$, three measuring parameters {n, p, q} are represented as three invariants. Three measurements transfer a phase space into a set of measuring phase spaces in a hierarchy.

Table 2 Six rotation clusters, various vectors in {G(m, 4)}


(m, n) | G(m, n)                  | $f_G(m, n)$
(1, 4) | {0000}                   | 1
(2, 4) | {0001, 0010, 0100, 1000} | 4
(3, 4) | {0011, 0110, 1100, 1001} | 4
(4, 4) | {0101, 1010}             | 2
(5, 4) | {0111, 1110, 1101, 1011} | 4
(6, 4) | {1111}                   | 1
$C_G(4) = 6$ |                    | $f_\Omega(4) = 16$


Definition 12 (L(p, n) combinatorial cluster) Let L(p, n) be a combinatorial cluster of vectors in $\Omega(n)$, $L(p, n) = \Omega(n|p) \subset \Omega(n)$. Two parameters {n, p} partition the phase space $\Omega(n)$ to form a set of clusters {L(p, n)} in a measuring phase space.

$$\Omega(n|p) = L(p, n) = \{\forall x \mid 0 \le p \le n,\ x \in \Omega(n)\}. \tag{13}$$


Corollary 5 A set of $\{L(p, n)\}_{p=0}^{n}$ is composed of a measuring phase space in one dimension.


Proof The parameter p is the active invariant to arrange the phase space in a linear order.


Definition 13 Let $C_L(n)$ be the number of clusters in $\forall p, \{L(p, n)\}$.


Lemma 6 For a given n,

$$C_L(n) = n + 1. \tag{14}$$


Proof Using Definition 12, $0 \le p \le n$ and for any p, $L(p, n) \ne \emptyset$, the parameter p partitions the whole set $\Omega(n)$ into n + 1 distinct subsets as clusters.


Definition 14 ($f_L(p, n)$ combinatorial number) Let $f_L(p, n)$ be the combinatorial number of vectors in a cluster L(p, n).


Lemma 7 For a pair of {n, p} parameters,

$$f_L(p, n) = \binom{n}{p}. \tag{15}$$


Proof Using Definition 12, this number is equal to the binomial coefficient for selecting p elements from n positions.


It is convenient to list the defined measuring parameters in Table 3 for the n = 4 condition.

Table 3 Five clusters, various vectors in {L(p, 4)}


(p, n) | L(p, n)                              | $f_L(p, n)$
(0, 4) | {0000}                               | 1
(1, 4) | {0001, 0010, 0100, 1000}             | 4
(2, 4) | {0011, 0110, 1100, 1001, 0101, 1010} | 6
(3, 4) | {0111, 1110, 1101, 1011}             | 4
(4, 4) | {1111}                               | 1
$C_L(4) = 5$ |                                | $f_\Omega(4) = 16$


Definition 15 (E(q, n) crossing cluster of vectors) Let E(q, n) be a crossing cluster of vectors in $\Omega(n)$, $E(q, n) = \Omega(n|q) \subset \Omega(n)$. Two parameters {n, q} partition the phase space $\Omega(n)$ to form a set of clusters {E(q, n)} in a measuring phase space.

$$\Omega(n|q) = E(q, n) = \{\forall x \mid 0 \le q \le \lfloor n/2 \rfloor,\ x \in \Omega(n)\}. \tag{16}$$

Corollary 6 A set of $\{E(q, n)\}_{q=0}^{\lfloor n/2 \rfloor}$ is composed of a measuring phase space in one dimension.


Proof The parameter q is the active invariant to arrange the phase space in a linear order.


Definition 16 Let $C_E(n)$ be the number of crossing clusters in $\forall q, \{E(q, n)\}$.


Lemma 8 For a given n > 0,

$$C_E(n) = \lfloor n/2 \rfloor + 1. \tag{17}$$


Proof According to Definition 15 and each $E(q, n) \ne \emptyset$, $0 \le q \le \lfloor n/2 \rfloor$, the parameter q partitions the whole set $\Omega(n)$ into $\lfloor n/2 \rfloor + 1$ distinct subsets as clusters.


Definition 17 ($f_E(q, n)$ number of vectors) Let $f_E(q, n)$ be the number of vectors in a cluster E(q, n).


Lemma 9 For a pair of {n, q} parameters,

$$f_E(q, n) = 2\binom{n}{2q}, \quad 0 \le q \le \lfloor n/2 \rfloor. \tag{18}$$


Proof Two cases can be distinguished: Case 1: q = 0; Case 2: $1 \le q \le \lfloor n/2 \rfloor$.

Case 1: All values are either 1 or 0, $2 \times \binom{n}{0} = 2$.

Case 2: For a given q, 2q crossing positions are composed of a pair of a 0-1 crossing then a 1-0 crossing, repeated q times in a vector, and this configuration includes a total of $\binom{n}{2q}$ vectors. The same pair of positions can be exchanged as a pair of 1-0 and 0-1 crossings with the same number of different vectors, so a total of $2 \times \binom{n}{2q}$ vectors are involved in each q selection.


It is convenient to list the measuring parameters defined above in Table 4 for the n = 4 condition.


Table 4 Three clusters, vectors in {E(q, 4)} cases


(q, n) | E(q, n) | $f_E(q, n)$
(0, 4) | {0000, 1111} | 2
(1, 4) | {0001, 0010, 0100, 1000, 0011, 0110, 1100, 1001, 0111, 1110, 1101, 1011} | 12
(2, 4) | {0101, 1010} | 2
$C_E(4) = 3$ | | $f_\Omega(4) = 16$
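
Lemma 9 can be checked by brute force for small n. The sketch below is an illustration under assumptions: `crossings` and `crossing_counts` are hypothetical helpers, and the closed form $2\binom{n}{2q}$ is the reading of Eq. 18 that agrees with the n = 4 values in Table 4.

```python
from collections import Counter
from math import comb


def crossings(x: str) -> int:
    """q(x): number of cyclic 1-0 crossings in the bit string x."""
    return sum(1 for i in range(len(x)) if x[i - 1] == "1" and x[i] == "0")


def crossing_counts(n: int) -> Counter:
    """Brute-force f_E(q, n): how many n-bit vectors have q crossings."""
    return Counter(crossings(format(v, f"0{n}b")) for v in range(2 ** n))


for n in (4, 5, 6):
    counts = crossing_counts(n)
    for q in range(n // 2 + 1):
        # Enumerated count next to the closed form 2 * C(n, 2q).
        print(n, q, counts[q], 2 * comb(n, 2 * q))
```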


3 Variant Symmetric Clusters 


Definition 18 (V(q, p, n) variant cluster) Let V(q, p, n) be a variant cluster of vectors in $\Omega(n)$, $V(q, p, n) = \Omega(n|p, q) \subset \Omega(n)$. Three parameters {n, p, q} partition the phase space $\Omega(n)$ to form a set of clusters {V(q, p, n)} in a measuring phase space.

$$\Omega(n|p, q) = V(q, p, n) = \{\forall x \mid 0 \le p \le n,\ 0 \le q \le \lfloor n/2 \rfloor,\ x \in \Omega(n)\} \tag{19}$$


Corollary 7 A set of $\{V(q, p, n)\}_{\forall q, p}$ is composed of a measuring phase space in two dimensions.


Proof The invariants q and p are two active invariants that arrange the phase space on a 2D plane lattice.


Lemma 10 Both {L(p, n)} combinatorial clusters and {E(q, n)} crossing clusters can be generated from special subsets of {V(q, p, n)} variant clusters.


Proof For a given p, L(p, n) can be determined by

$$L(p, n) = \bigcup_{q=0}^{\lfloor n/2 \rfloor} V(q, p, n).$$

For a given q, E(q, n) can be determined by

$$E(q, n) = \bigcup_{p=0}^{n} V(q, p, n).$$
p=0 

Table 5 Three sets of variant clusters for n = 4 in {V(q, p, n)} condition


q\p     | 0          | 1          | 2          | 3          | 4          | E(q, n)
0       | V(0, 0, 4) |            |            |            | V(0, 4, 4) | E(0, 4)
1       |            | V(1, 1, 4) | V(1, 2, 4) | V(1, 3, 4) |            | E(1, 4)
2       |            |            | V(2, 2, 4) |            |            | E(2, 4)
L(p, n) | L(0, 4)    | L(1, 4)    | L(2, 4)    | L(3, 4)    | L(4, 4)    | $\Omega(4)$


Applying this set of partitions, three sets of relevant clusters can be identified.

For example, for n = 4 and all 16 vectors in the vector space, three sets of clusters can be distinguished: six clusters for {V(q, p, n)}, five clusters for {L(p, n)} and three clusters for {E(q, n)}, as shown in Table 5.


Definition 19 Let $C_V(n)$ be the number of non-trivial variant clusters in $\forall q, p, \{V(q, p, n)\}$.


In general condition for any given n > 1, three sets of variant clusters could be 
shown in Fig. 1. 


Theorem 1 For a given n, $C_V(n)$ satisfies Eq. 20

$$C_V(n) = \begin{cases} n^2/4 + 2, & n \equiv 0 \bmod 2 \\ (n^2 - 1)/4 + 2, & n \equiv 1 \bmod 2. \end{cases} \tag{20}$$


Proof From Fig. 1 for a given n, the triangular shape of non-trivial variant clusters is composed of two parts: a triangular area and two q = 0 points. The triangular area has length $n - 1$ and height $\lfloor n/2 \rfloor$. If $n \equiv 0 \bmod 2$, the triangular area is a regular triangle containing $n^2/4$ clusters, so the triangular shape contains $n^2/4 + 2$ clusters in total. For an odd n, the triangular area has an additional $\lfloor n/2 \rfloor$ clusters beside a regular triangle with $\lfloor n/2 \rfloor^2$ clusters, so the total number of clusters is $\lfloor n/2 \rfloor^2 + \lfloor n/2 \rfloor + 2 = (n^2 - 1)/4 + 2$.


[Figure: 2D lattice of variant clusters V(q, p, n), with p increasing downward from 0 to n and q increasing to the right from 0 to $\lfloor n/2 \rfloor$. The right-hand column lists the combinatorial clusters L(p, n), the bottom row lists the crossing clusters E(q, n), and the corner holds $\Omega(n)$.]


Fig. 1 Three sets of variant clusters {V(q, p, n)}, {E(q, n)}, {L(p, n)} for n > 1


3.1 Variant Trinomial Coefficients — Elementary Equation 


Definition 20 Let $f_V(q, p, n)$ or $f(q, p, n)$, $0 \le p \le n$, $0 \le q \le \lfloor n/2 \rfloor$, denote an enumeration function for the number of 0-1 vectors in a variant cluster.


It is convenient to list the relevant measuring parameters for the n = 4 case in Table 6.


Definition 21 For the initial and end clusters, $p \in \{0, n\}$, $q = 0$, let
$f(0, 0, n) = f(0, n, n) = 1$. For other cases, each cluster with $0 < p < n$, $0 < q \le \lfloor n/2 \rfloor$
contains a subgroup of vectors under a given condition. A variant trinomial coefficient
for the number of vectors in a cluster is defined by the elementary equation in Eq. 21,

$$f(q, p, n) = \frac{n}{n-p}\binom{n-p}{q}\binom{p-1}{q-1}. \tag{21}$$


Applying the variant trinomial coefficients in Eq. 21, it is not difficult to process
more complicated cases in enumeration. Global arrangements in their triangular
shapes are conveniently ordered by the p measure in the vertical direction. Two cases,
n = {4, 5}, are shown in Table 7.

In the general case, for any given n > 1, the three sets of numbers can be arranged as
shown in Fig. 2.
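As a hedged check of Eq. 21, the following sketch compares the closed form with a direct enumeration of all 0-1 vectors. It assumes, as an interpretation consistent with Table 6, that q is the number of cyclic 0→1 transitions of a vector; `f_formula` and `f_bruteforce` are illustrative names, not part of the original text.

```python
from collections import Counter
from itertools import product
from math import comb


def f_formula(q, p, n):
    """Eq. 21, together with the two end clusters of Definition 21."""
    if q == 0:
        return 1 if p in (0, n) else 0
    if not 0 < p < n:
        return 0
    # The value counts vectors, so the division by (n - p) is exact.
    return n * comb(n - p, q) * comb(p - 1, q - 1) // (n - p)


def f_bruteforce(n):
    """Count the vectors of every (q, p) class by enumerating all 2^n vectors."""
    counts = Counter()
    for v in product((0, 1), repeat=n):
        p = sum(v)
        q = sum(1 for i in range(n) if v[i - 1] == 0 and v[i] == 1)
        counts[(q, p)] += 1
    return counts


for n in (4, 5):
    counts = f_bruteforce(n)
    for (q, p), value in sorted(counts.items()):
        print(n, (q, p), value, f_formula(q, p, n))
```

For n = 4 the printed values reproduce the last column of Table 6 (1, 1, 4, 4, 4, 2).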


Table 6 Six clusters and their vectors in {V(q, p, 4)}

(q, p, n)    V(q, p, 4)                    f(q, p, 4)
(0, 0, 4)    {0000}                        1
(0, 4, 4)    {1111}                        1
(1, 1, 4)    {0001, 0010, 0100, 1000}      4
(1, 2, 4)    {0011, 0110, 1100, 1001}      4
(1, 3, 4)    {0111, 1110, 1101, 1011}      4
(2, 2, 4)    {0101, 1010}                  2
             C_V(4) = 6                    f_Ω(4) = 16
Table 7 Three sets of vector numbers {f(q, p, n)}, {f_E(q, n)}, {f_L(p, n)}: (a) n = 4; (b) n = 5

p\q    0    1    2    f_L(p,4)
0      1              1
1           4         4
2           4    2    6
3           4         4
4      1              1

(a) n = 4

[Figure 2: triangular arrangement of the numbers f(q, p, n). Rows are indexed by p = 0, ..., n with row sums f_L(p, n); columns are indexed by q = 0, ..., ⌊n/2⌋ with column sums f_E(q, n); the total f_Ω(n) sits in the corner.]

Fig. 2 Three sets of variant numbers {f(q, p, n)}, {f_E(q, n)}, {f_L(p, n)} for n > 1


3.2 Combinatorial Projection on Variant Clusters 


From an algebraic viewpoint, the following theorems and corollaries are established
for the general case of any n > 1.


Lemma 11 If $f_L(p, n) = \sum_{q=1}^{\lfloor n/2 \rfloor} f(q, p, n)$, $0 < p < n$, then the projection
function $f_L(p, n)$ is a binomial coefficient and

$$f_L(p, n) = \binom{n}{p}. \tag{22}$$


Proof For a fixed p, 0 < p < n, all possible {f(q, p, n)} are collected to form the
following combinatorial identities [14, 15, 21]:

$$\begin{aligned}
f_L(p, n) &= \sum_{q=1}^{\lfloor n/2 \rfloor} f(q, p, n) \\
&= \frac{n}{n-p} \sum_{q} \binom{n-p}{q}\binom{p-1}{q-1} \\
&= \frac{n}{n-p} \binom{n-1}{p} \\
&= \frac{n\,(n-1)!}{(n-p)\,(n-p-1)!\,p!} \\
&= \frac{n!}{(n-p)!\,p!} = \binom{n}{p}.
\end{aligned}$$


For a complete sequence of binomial coefficients, it is necessary to include both 
initial and end clusters. Further Theorem 2 can be established. 


Theorem 2 For any given n > 0, the set of projection functions $\{f_L(p, n)\}_{p=0}^{n}$ is
composed of the same sequence of binomial coefficients

$$f_L(p, n) = \binom{n}{p}. \tag{23}$$

Proof For 0 < p < n, the equation has been determined by Lemma 11,
and for the two end clusters $p \in \{0, n\}$, $\binom{n}{0} = \binom{n}{n} = 1$ are determined by Definition 21.


Corollary 8 The sum of all possible $\{f_L(p, n)\}_{p=0}^{n}$ is equal to $f_\Omega(n)$,

$$\sum_{p=0}^{n} f_L(p, n) = f_\Omega(n) = 2^n. \tag{24}$$

Proof Collecting all possible numbers in Theorem 2, we have

$$\sum_{p=0}^{n} f_L(p, n) = \sum_{p=0}^{n} \binom{n}{p} = (1 + 1)^n = 2^n = f_\Omega(n).$$
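The combinatorial projection of Lemma 11, Theorem 2 and Corollary 8 can be sketched numerically as follows. This is a minimal illustration under the assumption that the cluster sizes are given by Eq. 21 and Definition 21; the helper names `f` and `f_L` are introduced here only for the example.

```python
from math import comb


def f(q, p, n):
    """Cluster size from Eq. 21, with the end clusters of Definition 21."""
    if q == 0:
        return 1 if p in (0, n) else 0
    if not 0 < p < n:
        return 0
    return n * comb(n - p, q) * comb(p - 1, q - 1) // (n - p)


def f_L(p, n):
    """Project the clusters of one row p onto the p axis by summing over q."""
    return sum(f(q, p, n) for q in range(n // 2 + 1))


n = 8
row_sums = [f_L(p, n) for p in range(n + 1)]
print(row_sums)        # expected to match the binomial coefficients C(8, p)
print(sum(row_sums))   # expected to match 2**8
```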


3.3 Crossing Projection on Variant Clusters 


Lemma 12 If $f_E(q, n) = \sum_{p=q}^{n-q} f(q, p, n)$, $1 \le q \le \lfloor n/2 \rfloor$, then the enumeration
function $f_E(q, n)$ is double a binomial coefficient

$$f_E(q, n) = 2\binom{n}{2q}. \tag{25}$$


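Proof For a fixed q, collecting all possible $\{f(q, p, n)\}_{p=q}^{n-q}$, the following
combinatorial identities [14, 15, 21] are deduced:

$$\begin{aligned}
f_E(q, n) &= \sum_{p=q}^{n-q} f(q, p, n) \\
&= \sum_{p=q}^{n-q} \frac{n}{n-p}\binom{n-p}{q}\binom{p-1}{q-1} \\
&= \frac{n}{q}\binom{n-1}{2q-1} \qquad \left(\text{using } \binom{N+1}{m+r+1} = \sum_{k}\binom{k}{m}\binom{N-k}{r}\right) \\
&= \frac{n}{q}\,\frac{(n-1)!}{(n-2q)!\,(2q-1)!} \\
&= 2\,\frac{n!}{(2q)!\,(n-2q)!} \\
&= 2\binom{n}{2q}.
\end{aligned}$$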
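As a worked instance of Lemma 12 (an illustrative calculation, not taken from the original text), take n = 6 and q = 2, so that p runs from 2 to 4 and each term follows from Eq. 21:

```latex
\begin{aligned}
f(2,2,6) &= \tfrac{6}{4}\tbinom{4}{2}\tbinom{1}{1} = 9, \qquad
f(2,3,6) = \tfrac{6}{3}\tbinom{3}{2}\tbinom{2}{1} = 12, \qquad
f(2,4,6) = \tfrac{6}{2}\tbinom{2}{2}\tbinom{3}{1} = 9, \\
f_E(2,6) &= 9 + 12 + 9 = 30 = 2\tbinom{6}{4}.
\end{aligned}
```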


Theorem 3 For any given n > 0 under the listed condition, the set of projection
functions $\{f_E(q, n)\}_{0 \le q \le \lfloor n/2 \rfloor}$ is composed of a subsequence of binomial coefficients,

$$f_E(q, n) = 2\binom{n}{2q}. \tag{26}$$

Proof For $1 \le q \le \lfloor n/2 \rfloor$, the equations are determined by Lemma 12, and
for the initial subgroup, q = 0, we have $f_E(0, n) = \binom{n}{0} + \binom{n}{n} = 2\binom{n}{0}$.


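Corollary 9 For $n \equiv 0 \bmod 2$, $0 \le q \le n/2$, there is a pair of symmetric functions

$$f_E(q, n) = f_E(n/2 - q, n). \tag{27}$$

Proof Under the $n \equiv 0 \bmod 2$ condition,

$$f_E(q, n) = 2\binom{n}{2q} = 2\binom{n}{n - 2q} = 2\binom{n}{2(n/2 - q)} = f_E(n/2 - q, n).$$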


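Corollary 10 For $n \equiv 0 \bmod 4$, $q = n/4$, $f_E(n/4, n)$ has the maximal value

$$f_E(n/4, n) > f_E(q, n), \quad q < n/4. \tag{28}$$

Proof Under the $n \equiv 0 \bmod 4$ condition,

$$f_E(q, n) = 2\binom{n}{2q} < 2\binom{n}{n/2} = 2\binom{n}{2 \cdot n/4} = f_E(n/4, n).$$

Corollary 11 The sum of all possible $\{f_E(q, n)\}_{0 \le q \le \lfloor n/2 \rfloor}$ is equal to $f_\Omega(n)$,

$$\sum_{q=0}^{\lfloor n/2 \rfloor} f_E(q, n) = f_\Omega(n) = 2^n. \tag{29}$$

Proof Collecting all possible numbers, and using the fact that the even-indexed and odd-indexed binomial coefficients of n have equal sums, we have the following equations:

$$\sum_{q=0}^{\lfloor n/2 \rfloor} f_E(q, n) = \sum_{q=0}^{\lfloor n/2 \rfloor} 2\binom{n}{2q} = \sum_{k \ge 0}\binom{n}{2k} + \sum_{k \ge 0}\binom{n}{2k+1} = \sum_{k \ge 0}\binom{n}{k} = 2^n.$$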


3.4 Relationships of Four Symmetric Clusters 


Theorem 4 For any n > 0, the sum of all possible functions on $\{f(q, p, n)\}_{\forall p, \forall q}$
or $\{f_E(q, n)\}_{0 \le q \le \lfloor n/2 \rfloor}$ or $\{f_L(p, n)\}_{p=0}^{n}$ or $\{f_G(m, n)\}$, $1 \le m \le C_G(n)$, is equal to
$f_\Omega(n)$:

$$f_\Omega(n) = \sum_{\forall p}\sum_{\forall q} f(q, p, n) = \sum_{q=0}^{\lfloor n/2 \rfloor} f_E(q, n) = \sum_{p=0}^{n} f_L(p, n) = \sum_{m=1}^{C_G(n)} f_G(m, n) = 2^n. \tag{30}$$

Proof From the results of Corollaries 4, 8 and 11, the four schemes provide different
complete partitions of the same set of vectors in $\Omega(n)$.


Corollary 12 The numbers of the four kinds of symmetric clusters can be expressed by

$$\begin{aligned}
C_E(n) &= \lfloor n/2 \rfloor + 1; \\
C_L(n) &= n + 1; \\
C_V(n) &= \begin{cases} n^2/4 + 2, & n \equiv 0 \bmod 2 \\ (n^2 - 1)/4 + 2, & n \equiv 1 \bmod 2 \end{cases}; \\
C_G(n) &= \frac{1}{n}\sum_{k \mid n} \phi(k)\, 2^{n/k}.
\end{aligned}$$

Proof Due to Lemmas 4, 6, 8 and Theorem 1, four equations for numbers of various
symmetric clusters are listed.

Table 8 Numbers of four symmetric clusters in 1 ≤ n ≤ 16

n        1  2  3  4  5  6   7   8   9   10   11   12   13   14    15    16
C_E(n)   1  2  2  3  3  4   4   5   5   6    6    7    7    8     8     9
C_L(n)   2  3  4  5  6  7   8   9   10  11   12   13   14   15    16    17
C_V(n)   2  3  4  6  8  11  14  18  22  27   32   38   44   51    58    66
C_G(n)   2  3  4  6  8  14  20  36  60  108  188  352  632  1182  2192  4116
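The four counting formulas of Corollary 12 are easy to evaluate directly. The sketch below is an illustration with hypothetical function names; `phi` is a plain implementation of Euler's totient, and the rotation count uses the divisor sum stated in the corollary.

```python
from math import gcd


def phi(k):
    """Euler's totient: how many 1 <= i <= k are coprime to k."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)


def c_e(n):
    """Crossing clusters: floor(n/2) + 1."""
    return n // 2 + 1


def c_l(n):
    """Combinatorial clusters: n + 1."""
    return n + 1


def c_v(n):
    """Variant clusters (Theorem 1); integer division covers both parities."""
    return n * n // 4 + 2


def c_g(n):
    """Rotation clusters: (1/n) * sum over divisors k of n of phi(k) * 2^(n/k)."""
    return sum(phi(k) * 2 ** (n // k) for k in range(1, n + 1) if n % k == 0) // n


for n in range(1, 17):
    print(n, c_e(n), c_l(n), c_v(n), c_g(n))
```

Each printed row corresponds to one column of Table 8.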


For ease of comparison, their values for 1 ≤ n ≤ 16 are listed in Table 8.
Checking the actual clusters in the four schemes, the following corollaries can be stated.


Corollary 13 When n ∈ {1, 2, 3}, the three cluster schemes $C_L(n)$, $C_V(n)$, $C_G(n)$
provide the same partitions of clusters.

Proof Checking the three schemes, we have $C_L(1) = C_V(1) = C_G(1) = 2$,
$C_L(2) = C_V(2) = C_G(2) = 3$, $C_L(3) = C_V(3) = C_G(3) = 4$. Corresponding clusters contain the
same sets of vectors.


Corollary 14 When n ∈ {1, 2, 3, 4, 5}, the two cluster schemes $C_V(n)$, $C_G(n)$ provide
the same partitions of clusters.

Proof Due to Corollary 13, we only need to check the n ∈ {4, 5} cases. For the two schemes,
we have $(C_L(4) = 5) \ne (C_V(4) = C_G(4) = 6)$ and $(C_L(5) = 6) \ne (C_V(5) = C_G(5) = 8)$.
Corresponding clusters contain the same sets of vectors.


Corollary 15 When n ≥ 6, the four cluster schemes $C_E(n)$, $C_L(n)$, $C_V(n)$, $C_G(n)$
provide different partitions on their clusters.

Proof Due to Corollaries 13 and 14, we only need to check the n ∈ {6, ...} cases. For the four
schemes, $C_E(6) = 4$, $C_L(6) = 7$, $C_V(6) = 11$, $C_G(6) = 14$. Only a few clusters
contain the same set of vectors.

Corollary 16 When n ≥ 6, the three cluster schemes (combinatorial, crossing and
variant) $\{C_E(n), C_L(n), C_V(n)\}$ may contain more symmetric properties than the rotation
clusters of $C_G(n)$.

Proof Consider the special case {n = 6, p = 3, q = 2}: V(2, 3, 6) = {001101,
011010, 110100, 101001, 010011, 100110, 011001, 110010, 100101, 001011, 010110,
101100}. This cluster contains two cycles, {001101, 011010, 110100, 101001, 010011,
100110} and {011001, 110010, 100101, 001011, 010110, 101100}, with six vectors
each. Both cycles have rotation symmetries only, without reflection symmetries.
It is possible to use reflection symmetric operators to distinguish the two related cycles
and form a pure rotation symmetric structure. However, other clusters may contain
more cycles, such as L(3, 6) with four cycles and E(2, 6) with six cycles.
It is necessary to apply symmetric operators other than rotation for
further separations.
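The cycle structure used in the proof of Corollary 16 can be reproduced with a short enumeration. The sketch assumes that p is the number of 1s and q the number of cyclic 0→1 transitions of a vector (an interpretation consistent with the listed V(2, 3, 6)); `q_of` and `rotation_cycles` are illustrative names.

```python
from itertools import product


def q_of(v):
    """Number of cyclic 0->1 transitions of a 0-1 string (assumed meaning of q)."""
    return sum(1 for i in range(len(v)) if v[i - 1] == "0" and v[i] == "1")


def rotation_cycles(vectors):
    """Split a set of 0-1 strings into its orbits under cyclic rotation."""
    remaining = set(vectors)
    cycles = []
    while remaining:
        v = min(remaining)
        orbit = {v[i:] + v[:i] for i in range(len(v))}
        cycles.append(sorted(orbit))
        remaining -= orbit
    return cycles


n = 6
space = ["".join(bits) for bits in product("01", repeat=n)]
V_2_3_6 = [v for v in space if v.count("1") == 3 and q_of(v) == 2]
L_3_6 = [v for v in space if v.count("1") == 3]
E_2_6 = [v for v in space if q_of(v) == 2]

for name, cluster in (("V(2,3,6)", V_2_3_6), ("L(3,6)", L_3_6), ("E(2,6)", E_2_6)):
    cycles = rotation_cycles(cluster)
    print(name, len(cluster), "vectors,", len(cycles), "cycles")
```

Reversing every string of one cycle of V(2, 3, 6) yields the other cycle, which is the reflection relation mentioned in the proof.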


4 Four Number Sets of Symmetric Clusters 


4.1 Four Approximates on Numbers of Clusters 


Using the four numeric equations, relevant approximates can be expressed as follows. 


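Lemma 13 Four approximates can be expressed as

$$C_E(n) \sim O\!\left(\frac{n}{2}\right); \tag{31}$$

$$C_L(n) \sim O(n); \tag{32}$$

$$C_V(n) \sim O\!\left(\frac{n^2}{4}\right); \tag{33}$$

$$C_G(n) \approx O\!\left(\frac{2^n}{n}\right). \tag{34}$$

Proof Using the four equations, the following approximates can be expressed:

$$\begin{aligned}
C_E(n) &= \lfloor n/2 \rfloor + 1 \sim O\!\left(\frac{n}{2}\right); \\
C_L(n) &= n + 1 \sim O(n); \\
C_V(n) &= \begin{cases} n^2/4 + 2, & n \equiv 0 \bmod 2 \\ (n^2 - 1)/4 + 2, & n \equiv 1 \bmod 2 \end{cases} \sim O\!\left(\frac{n^2}{4}\right); \\
C_G(n) &= \frac{1}{n}\sum_{k \mid n} \phi(k)\, 2^{n/k} \approx O\!\left(\frac{2^n}{n}\right).
\end{aligned}$$

4.2 Four Approximates on Numbers of Vectors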


Definition 22 Let $f_X(n)$, $X \in \{L, E, V, G\}$, denote an approximate number of vectors
in an X cluster.

Lemma 14 Four approximates can be expressed as

$$f_E(n) \approx O\!\left(\frac{2^{n+1}}{n}\right); \tag{35}$$

$$f_L(n) \approx O\!\left(\frac{2^n}{n}\right); \tag{36}$$

$$f_V(n) \approx O\!\left(\frac{2^{n+2}}{n^2}\right); \tag{37}$$

$$f_G(n) \approx O(n). \tag{38}$$

Proof Since all cluster schemes partition the same phase space $\Omega(n)$ with $2^n$ vectors, their
approximates for the number of vectors in a cluster can be evaluated as

$$\begin{aligned}
f_E(n) &= \frac{2^n}{O(n/2)} \approx O\!\left(\frac{2^{n+1}}{n}\right); \\
f_L(n) &= \frac{2^n}{O(n)} \approx O\!\left(\frac{2^n}{n}\right); \\
f_V(n) &= \frac{2^n}{O(n^2/4)} \approx O\!\left(\frac{2^{n+2}}{n^2}\right); \\
f_G(n) &= \frac{2^n}{O(2^n/n)} \approx O(n).
\end{aligned}$$

It is convenient to list the approximate numbers of clusters and vectors, and the dimension
of the measuring phase spaces, in Table 9.

Table 9 Four approximate numbers on both clusters and vectors

X    C_X(n)        f_X(n)              Measuring phase space
E    O(n/2)        O(2^{n+1}/n)        1D
L    O(n)          O(2^n/n)            1D
V    O(n^2/4)      O(2^{n+2}/n^2)      2D
G    O(2^n/n)      O(n)                1D

5 Symmetric Boolean Functions for Selected Clusters


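5.1 Four Numbers on Symmetric Boolean Functions

Definition 23 Let $SF_X(n)$ denote the number of Symmetric Boolean Functions (SBF)
in $\{X(.)\}$, $X \in \{E, L, V, G\}$.

Theorem 5 (Four types of symmetric Boolean functions) The total numbers of the four
types of symmetric Boolean functions $SF_X(n)$, $X \in \{E, L, V, G\}$, are

$$SF_E(n) = 2^{C_E(n)} = 2^{\lfloor n/2 \rfloor + 1}; \tag{39}$$

$$SF_L(n) = 2^{C_L(n)} = 2^{n+1}; \tag{40}$$

$$SF_V(n) = 2^{C_V(n)} = \begin{cases} 2^{n^2/4 + 2}, & n \equiv 0 \bmod 2 \\ 2^{(n^2 - 1)/4 + 2}, & n \equiv 1 \bmod 2 \end{cases}; \tag{41}$$

$$SF_G(n) = 2^{C_G(n)} = O\!\left(2^{2^n/n}\right). \tag{42}$$

Proof For any selected cluster, there are two selections for its symmetric Boolean
functions (the function takes the value 0 or 1 on the whole cluster), so $C_X(n)$ clusters
give $2^{C_X(n)}$ functions.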

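5.2 Four Numbers of Balanced Symmetric Clusters

Definition 24 Let $SF_{Xb}(n)$ be the maximal number of balanced $SBF_X$ in $\{X(.)\}$,
$X \in \{L, V, G\}$, $n \equiv 0 \bmod 2$.

Definition 25 Let $SF_{Eb}(n)$ be the maximal number of balanced $SBF_E$ in $\{E(q, n)\}$ over all q,
$n \equiv 0 \bmod 4$.

Lemma 15 The four selected numbers $\{C_{Xb}(n)\}$, $X \in \{E, L, V, G\}$, for balanced
symmetric clusters are

$$C_{Eb}(n) = \begin{cases} 1, & n \equiv 0 \bmod 4 \\ 0, & n \not\equiv 0 \bmod 4 \end{cases}; \tag{43}$$

$$C_{Lb}(n) = 1; \tag{44}$$

$$C_{Vb}(n) = \frac{n}{2}; \tag{45}$$

$$C_{Gb}(n) \approx O\!\left(\frac{1}{n}\binom{n}{n/2}\right). \tag{46}$$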

Proof From Corollary 10, for Eb groups in the $n \equiv 0 \bmod 4$ cases, q = n/4 provides a
cluster with a maximal number of vectors in a balanced condition, and other cases
cannot satisfy balanced conditions; for Lb groups in the $n \equiv 0 \bmod 2$ cases, p = n/2
provides a cluster with a maximal number of vectors in a balanced condition; for Vb
groups in the $n \equiv 0 \bmod 2$ cases, p = n/2, 1 ≤ q ≤ n/2, there are n/2 clusters involved
in a balanced condition; for Gb groups in the $n \equiv 0 \bmod 2$ cases, p = n/2, a total of
$O\!\left(\frac{1}{n}\binom{n}{n/2}\right)$ rotation symmetric clusters could be involved in a balanced condition.

Table 10 Numbers of four balanced symmetric functions in 2 ≤ n ≤ 20

n              2     4     6     8      10    12    14    16    18    20
2^{C_Eb(n)}    1     2     1     2      1     2     1     2     1     2
2^{C_Lb(n)}    2     2     2     2      2     2     2     2     2     2
2^{C_Vb(n)}    2^1   2^2   2^3   2^4    2^5   2^6   2^7   2^8   2^9   2^10
2^{C_Gb(n)}    2^1   2^2   2^4   2^10   O(2^{C(n, n/2)/n}) for each n ≥ 10
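The number of balanced rotation clusters can be computed exactly for small n. The sketch below is an illustration and not part of the original derivation: it uses the standard Burnside count of necklaces with a fixed number of 1s, and cross-checks it against a brute-force orbit enumeration; all function names are hypothetical.

```python
from itertools import combinations
from math import comb, gcd


def phi(k):
    """Euler's totient: how many 1 <= i <= k are coprime to k."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)


def balanced_rotation_clusters(n):
    """Rotation classes of n-bit vectors with exactly n/2 ones (n even)."""
    half = n // 2
    g = gcd(n, half)
    total = sum(phi(d) * comb(n // d, half // d) for d in range(1, g + 1) if g % d == 0)
    return total // n


def balanced_rotation_clusters_bruteforce(n):
    """Same count by enumerating vectors and marking whole rotation orbits."""
    seen = set()
    count = 0
    for ones in combinations(range(n), n // 2):
        v = tuple(1 if i in ones else 0 for i in range(n))
        if v in seen:
            continue
        count += 1
        for r in range(n):
            seen.add(v[r:] + v[:r])
    return count


for n in (2, 4, 6, 8):
    print(n, balanced_rotation_clusters(n), comb(n, n // 2) / n)
```

For n = 2, 4, 6, 8 the exact counts are 1, 2, 4, 10, matching the exponents in the last row of Table 10; the ratio $\binom{n}{n/2}/n$ printed alongside is the leading term used in Eq. 46.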


5.3 Four Numbers of Balanced Symmetric Boolean 
Functions 


Theorem 6 (Four balanced symmetric Boolean functions) The total numbers of the
four balanced symmetric Boolean functions $\{SF_{Xb}(n)\}$, $X \in \{E, L, V, G\}$, are

$$SF_{Eb}(n) = 2^{C_{Eb}(n)} = \begin{cases} 2, & n \equiv 0 \bmod 4 \\ 1, & n \not\equiv 0 \bmod 4 \end{cases}; \tag{47}$$

$$SF_{Lb}(n) = 2^{C_{Lb}(n)} = 2; \tag{48}$$

$$SF_{Vb}(n) = 2^{C_{Vb}(n)} = 2^{n/2}; \tag{49}$$

$$SF_{Gb}(n) = 2^{C_{Gb}(n)} = O\!\left(2^{\frac{1}{n}\binom{n}{n/2}}\right). \tag{50}$$

Proof The number of clusters in each selected scheme has been determined in Lemma
15. For any selected cluster in the scheme, there are two selections to form relevant
symmetric Boolean functions.

For ease of comparison, the four types of $SF_{Xb}$ numbers for 2 ≤ n ≤ 20 are
listed in Table 10.


6 Cryptographic Properties of Symmetric Boolean 
Functions in Hierarchy 


Boolean functions are of great importance in the design of random number generators
for stream ciphers [25], which are widely used in modern network environments.
For cryptographic security, the sequence produced by the random number generator
must satisfy various properties [6, 8]: a long period, high period complexity and
good statistical distributions. There exists a large body of theoretical knowledge
about such combining generators [25].

A symmetric Boolean function must fulfil several necessary criteria to yield
a cryptographically secure scheme, or at least to resist known attacks [11]. In this
direction, various measuring parameters play an important role, such as balance,
support set, Hamming weight, Hamming distance, non-linearity,
correlation immunity, etc. [6, 8].

In relation to balanced properties, when n is even, the functions of highest non- 
linearity are the bent functions, and it is well known that the bent functions cannot be 
the balanced functions [28, 33]. From a structural viewpoint, the balanced functions 
having the highest possible non-linearity need to be considered. However, finding 
such functions is a very difficult problem [29, 31, 33]. When n is odd, exhibiting 
functions of the highest non-linearity is a hard problem in itself. Among the available 
candidates, balanced ones exist [16, 33]. 

To explore optimal functions in rotation symmetric Boolean function sets, many
researchers face extreme difficulties in computational complexity even for
n > 10 symmetric Boolean functions [29]. Exponentially increasing complexity
quickly makes an exhaustive search impossible. Comparing the
variant and rotation schemes listed in Table 10, it is interesting to notice that the
variant scheme at n = 20 has the same numeric complexity, $2^{10}$, as the rotation symmetric
scheme at n = 8. Much faster computation of optimal functions could therefore be
feasible.

From a meta-analytic viewpoint, measuring phase spaces provide multiple levels
of construction in a hierarchy linked to various symmetric Boolean functions.
They support an n-tuple 0-1 vector construction as a word-based 0-1 vector to
satisfy various design and analysis purposes. The variant PRNG construction [38, 43]
is an approach similar to the RC4 and HC128 stream ciphers [25] in their meta phase
spaces, using a word-oriented vector structure with higher speed and efficiency.
Measuring phase spaces could support advanced cryptographic applications in this
direction.

Because of the significant differences between the proposed measuring phase spaces and the
classically formulated algebraic normal forms, in addition to the initial balanced symmetric
properties discussed in this chapter, other advanced comparison mechanisms need
to be established for all cryptographic properties of interest, to satisfy practical and
optimal requirements for stream ciphers. Further detailed research and exploration
are required.


7 Conclusion 


Symmetric clusters in a hierarchy provide additional information to organize
various symmetric Boolean functions efficiently into hierarchical constructions with
multiple meta levels of structure. The variant symmetric functions proposed in this
chapter provide a meta construction on a 2D measuring phase space, contributing
richer capacities than the three classical schemes (combinatorial, crossing
and rotation) on 1D measuring phase spaces.

From a measuring viewpoint, three schemes (combinatorial, variant and rotation)
in Tables 8, 9 and 10 have similar values for n = {1, 2, 3} and {4, 5}, and different values
for n ≥ 6. The variant scheme provides a 2D intermediate structure, unlike
the other two schemes, which have 1D structures. From an approximate viewpoint, the
combinatorial and rotation schemes show strongly similar properties: their
approximate number of clusters and number of vectors in a cluster are exchanged
in Table 9. From an abstract system viewpoint, this pair of exchangeable
measurements may provide approximate symmetric properties for both the combinatorial and
rotation schemes.

From a clustering viewpoint, the most important results are summarized in Theo- 
rem 4 to show that the four symmetric cluster schemes are different partition schemes 
on the same 0-1 vector set. 

From a balanced analysis viewpoint, the key results on balanced symmetric
Boolean functions are summarized in Theorem 6 and Table 10. This set of results
provides a basic measurement to illustrate the computational difficulties of
exploring further optimal properties under balanced symmetric conditions. Unlike
the other three schemes (combinatorial, crossing and rotation), which become either very simple
or extremely complex as n increases, balanced variant symmetric
Boolean functions present very interesting patterns that support even n > 20 cases
for future exploration.

Many advanced properties exist that allow a meta hierarchical construction to
organize the relevant measuring phase spaces into multiple levels of a hierarchical structure.
Various measuring parameters can be used as control parameters in detailed cases.
Refined design and analysis can be performed under this meta hierarchy to provide
powerful models and tools for the design and optimization of future generations of
stream ciphers.
Part III
Theoretical Foundation— Variant Map 


Arc, amplitude, and curvature sustain a similar relation to each other 
as time, motion, and velocity, or as volume, mass, and density. 


—Carl Friedrich Gauss 


As long as algebra and geometry have been separated, their progress 
have been slow and their uses limited; but when these two sciences 
have been united, they have lent each mutual forces, and have marched 
together towards perfection. 


—Joseph-Louis Lagrange 


The arithmetical symbols are written diagrams and the geometrical 
figures are graphic formulas. 
—David Hilbert 


In relation to variant maps, a longer book chapter (Chapter “Interactive Maps on
Variant Phase Spaces”) was published in the open-access book Emerging Applications of
Cellular Automata: 113-196 (2013) by InTech Press. It provides systematic
approaches under statistical mechanics for comparison. Possible projections and
their mapping mechanisms are explored.

Part III is composed of three chapters (6-8). 

Chapter “Variant Maps of Elementary Equations” provides variant maps of
elementary equations to generate visual distributions using two cases of combinatorial
expressions. In the two cases, symmetric distributions can be seen
under various parameters, and complex distributions are created by control
parameters, shown in 2D and 3D distributions and their projections.

Chapter “Variant Map System of Random Sequences” describes a variant map
system for random sequences; five types of maps are defined and proposed: two
types of 1D maps and three types of 2D maps. A sample sequence from the AES
cipher is selected and multiple maps are illustrated.

Chapter “Stationary Randomness of Three Types of Six Random Sequences on
Variant Maps” proposes a testing system for the stationary randomness of random
sequences on variant maps. Three types of six random sequences are selected. The six
samples come from three random resources: two block ciphers, two stream
ciphers, and two quantum ciphers. Three variation categories are observed.


Variant Maps of Elementary Equations


Jeffrey Zheng 


Abstract Using four measures in Type B, there are 11 invariant expressions that
form elementary equations of variant measurement. In this chapter, two invariant
expressions are selected to illustrate sample procedures from elementary equations
to relevant variant maps. Using various projections and multiple levels of
representations, complicated binomial coefficients and their variations are illustrated under
various conditions. Using multinomial coefficients, multiple viewpoints are used for
reference. Because this type of variation framework contains rich structures, further
exploration is required at multiple levels, on both theoretical foundations and
practical applications.


Keywords Variant measurement · Elementary equation · Variant map ·
Multinomial coefficient · Coefficient array


1 Introduction 


Variant construction starts from n 0-1 variables to form $2^n$ states and $2^{2^n}$ functions;
via vector permutation and complement operations on the state space, it establishes
a variant logic framework that contains $2^n! \times 2^{2^n}$ configurations as a variation
space. Variant measurement acts as the core of quantitative measurement, starting from
m 0-1 variables to explore relevant clustering conditions on $2^m$ states. This type
of variation has a close relationship to partition and recombination using binomial
and multinomial coefficients under identical combinatorial expressions. Applying
the results in Chapter “Elementary Equations of Variant Measurement”, Type B
measures are composed of 11 nontrivial invariants. Two invariants are selected in this
chapter, and their different partition properties are illustrated using coefficients on 2D
and 3D distributions. Variant maps are generated from coefficient arrays as samples.


2 Measures and Maps 


Two combinatorial invariants are selected: {m − p}{p} and {2q}{m − 2q}. Different
distributions of their coefficients are explored.


2.1 Case 1. {m − p}{p}


For the {m − p}{p} formula, the relevant equation is

$$\binom{m}{p} = \sum_{k=0}^{p} \binom{m-p}{k}\binom{p}{k}. \tag{1}$$

A binomial coefficient is separated into a sum of (p + 1) products of pairs of binomial
coefficients. For a selected value p, the coefficients $\{\binom{m-p}{k}\binom{p}{k}\}$, $0 \le k \le p$, are arranged in
a linear order.

This property is true for all p values. A three-tuple (m, p, k)
has a 1-1 correspondence with a coefficient $f(m, p, k) = \binom{m-p}{k}\binom{p}{k}$. As m
increases, the coefficient array grows as 3D rectangular steps; each m value
has an $(m + 1)^2$ region.

The nontrivial coefficients are distributed as a triangle. Let $F(m, p) = \sum_{k} f(m, p, k)$,
$0 \le p \le m$, and $G(m, k) = \sum_{p} f(m, p, k)$, $0 \le k \le m$; these two projections
{F(m, p), G(m, k)} can be computed. The coefficients and the relevant four maps are
shown in Fig. 1.


Lemma 1 For the {m − p}{p} equation, coefficients are distributed in an $(m + 1)^2$ region;
all nontrivial coefficients are clustered in 1/4 of the region, and the remaining 3/4 of the
region has coefficient 0.
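The Case 1 coefficient array and its two projections can be sketched as follows. This is an illustrative computation, assuming $f(m, p, k) = \binom{m-p}{k}\binom{p}{k}$ as in Eq. 1; the function names are hypothetical.

```python
from math import comb


def f1(m, p, k):
    """Case 1 coefficient C(m - p, k) * C(p, k)."""
    return comb(m - p, k) * comb(p, k)


def coefficient_array(m):
    """(m + 1) x (m + 1) array indexed as array[k][p]."""
    return [[f1(m, p, k) for p in range(m + 1)] for k in range(m + 1)]


def projections(m):
    """Return F(m, p) (sum over k) and G(m, k) (sum over p)."""
    a = coefficient_array(m)
    F = [sum(a[k][p] for k in range(m + 1)) for p in range(m + 1)]
    G = [sum(a[k]) for k in range(m + 1)]
    return F, G


def nonzero_fraction(m):
    """Share of the (m + 1)^2 region occupied by nonzero coefficients."""
    a = coefficient_array(m)
    return sum(1 for row in a for c in row if c) / (m + 1) ** 2


F, G = projections(10)
print(coefficient_array(10)[1])   # the k = 1 row for m = 10
print(F)                          # row sums over k
print(G)                          # column sums over p
print(nonzero_fraction(10))
```

For m = 10 the nonzero share is 36/121, and it approaches the 1/4 of Lemma 1 as m grows.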


2.2 Case 2. {2q}{m — 2q} 


Briefly, {m − p}{p} and {2q}{m − 2q} are simple invariants. The {2q}{m − 2q}
invariant has the following equation.

Fig. 1 One set of coefficients and its two projections in four maps (a)-(d); a 3D f(10, p, k); b 2D
f(10, p, k); c 1D F(10, p); d 1D G(10, k)


$$\binom{m}{p} = \sum_{k=0}^{p} \binom{2q}{k}\binom{m-2q}{p-k}, \tag{2}$$

where q is a free variable, $0 \le q \le \lfloor m/2 \rfloor$. Different from Case 1, this equation
determines $\lfloor m/2 \rfloor + 1$ levels of coefficients according to the q values
selected, forming a 3D coefficient structure.
Let $f(m, q, p, k) = \binom{2q}{k}\binom{m-2q}{p-k}$ under the conditions $0 \le q \le \lfloor m/2 \rfloor$, $0 \le k, p \le m$;
the nontrivial coefficients are distributed in special shapes on multiple 2D regions.
Using a color coding scheme, it is feasible to map coefficients into greyscale or
color pixels as variant maps.
A binomial coefficient can be separated into a sum of (p + 1) products of pairs of
coefficients $\binom{2q}{k}\binom{m-2q}{p-k}$, $0 \le k \le p$, in a linear order.
This property is true for all p values; a tuple of four parameters
(m, q, p, k) has a 1-1 correspondence with a coefficient $\binom{2q}{k}\binom{m-2q}{p-k}$. Each selected m
value corresponds to an $(m + 1)^2 \times (\lfloor m/2 \rfloor + 1)$ region locating all coefficients.


Lemma 2 For the {2q}{m − 2q} combinatorial invariant, all coefficients are restricted
to an $(m + 1)^2 \times (\lfloor m/2 \rfloor + 1)$ region.
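A minimal sketch of the Case 2 structure follows. It assumes the coefficient form $\binom{2q}{k}\binom{m-2q}{p-k}$ (one reading of Eq. 2), builds the $\lfloor m/2 \rfloor + 1$ levels for m = 10, and maps each level to greyscale values with a logarithmic scale; the scaling rule and all names are illustrative choices, not the author's color coding.

```python
from math import comb, log2


def f2(m, q, p, k):
    """Case 2 coefficient C(2q, k) * C(m - 2q, p - k); zero outside 0 <= k <= p."""
    if k < 0 or k > p:
        return 0
    return comb(2 * q, k) * comb(m - 2 * q, p - k)


def level(m, q):
    """One 2D level of the 3D structure, indexed as array[p][k]."""
    return [[f2(m, q, p, k) for k in range(m + 1)] for p in range(m + 1)]


def to_grey(array):
    """Map coefficients to 0..255 on a log scale (an illustrative coding rule)."""
    peak = max(max(row) for row in array)
    scale = 255 / log2(peak + 1)
    return [[round(scale * log2(c + 1)) for c in row] for row in array]


m = 10
levels = [level(m, q) for q in range(m // 2 + 1)]
grey_maps = [to_grey(a) for a in levels]
print(len(levels), "levels of size", len(levels[0]), "x", len(levels[0][0]))
```

Within every level, the entries of row p sum to $\binom{m}{p}$, which is the invariant expressed by Eq. 2.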
3 Visual Results


It is convenient to use color coding to render each coefficient as a pixel in a variant
map. Invariant coefficients provide ideal conditions for a practical measurement: it
is feasible to check the physical differences between an ideal distribution and a practical
measurement.

From a quantitative viewpoint, multinomial expressions provide a proper basis of
corresponding partitions to serve as a relative measurement in representation.


3.1 Case 1. Maps 


Using $\binom{m}{p} \to \{\binom{m-p}{k}\binom{p}{k}\}$, maps are shown in Fig. 2 as 2D coefficients, 3D
histograms, and 2D projections for four parameters m = {10, 11, 15, 16}, respectively.


3.2 Case 2. Maps 


Different from Case 1, where each m is associated with one 2D coefficient array, under
$\binom{m}{p} \to \{\binom{2q}{k}\binom{m-2q}{p-k}\}$ conditions each q selection determines a 2D array of coefficients.
Under $0 \le q \le \lfloor m/2 \rfloor$ conditions, $\lfloor m/2 \rfloor + 1$ levels are required. For m = 10, it
is necessary to have 6 levels.

To observe global properties, a 3D color map is shown in Fig. 4 to illustrate the 3D
coefficients under color coding.


4 Result Analysis 


In the maps of Figs. 1, 2, and 3, it is convenient to see variant maps transformed from
elementary equations. From a certain viewpoint, the {m − p}{p} coefficients have
symmetric properties in the horizontal direction, on p : m − p, with reflective properties.
Nontrivial coefficients are located in 1/4 of the $(m + 1)^2$ square. An isosceles
triangle is composed of all nontrivial coefficients. For any selected m, there is only one
associated 2D coefficient array, forming a unified distribution.

The {2q}{m − 2q} coefficients correspond to multiple 2D distributions under
various q values. When q = 0, each nontrivial coefficient is located on the diagonal
position p = k and each coefficient is a $\binom{0}{\cdot}\binom{m}{\cdot}$ product. Under $0 \le q \le 5$
conditions, the 2D coefficient matrices are shown in six groups of {0 : 10, 2 : 8, 4 : 6, 6 :
4, 8 : 2, 10 : 0}; this can be described as $(x + y)^{i+j} = (x + y)^i (x + y)^j$ coefficient
distributions, illustrated in the Fig. 3 {(a0)-(c0)} to {(a5)-(c5)} maps.

Fig. 2 {m − p}{p} maps: m = {10, 11, 15, 16}; (a1)-(d1) m = 10; (a2)-(d2) m = 11;
(a3)-(d3) m = 15; (a4)-(d4) m = 16

Fig. 3 {2q}{m − 2q} maps: m = 10; (a0)-(c0) q = 0;
(a3)-(c3) q = 3; (a4)-(c4) q = 4; (a5)-(c5) q = 5

Fig. 4 {2q}{m − 2q} map: m = 10; 3D color map

5 Conclusion 


Using elementary equations to illustrate relevant variant maps is a new exploration.
Based on the described model and calculation, it is convenient to perform various analyses
and visualizations. Checking two invariants from Type B for four
variant measures is an initial step. Further exploration is required on the five levels of 11 nontrivial
invariants in Type B. From the results in this chapter, distinct distributions are observed
for the two selected invariants. The other nine invariants in Type B will be discussed in
future papers (Fig. 4).
Abstract Sequences of random variables play a key role in probability theory,
stochastic processes, and statistics to analyze dynamic behavior. Speckle patterns 
have emerged as useful tools to explore space-time variations of random sequences 
in various measurement applications of comprehensive properties in complex space— 
time variation events. In this chapter, a variant map system is proposed to analyze 
statistical properties of random sequences in visual representations. An input 0-1 
sequence will be divided into multiple segments and each segment of a fixed length 
will be transformed into a 2-tuple pair of measures. Five measuring sets are identified 
and rearranged in a 1D or 2D numerical array as a histogram representing a visual 
map. These five types of maps consist of two types in 1D format as classical maps 
and three types in 2D format as variant maps. Properties are analyzed on all five 
types of maps. A cryptographic sequence of the AES cipher is selected as a sample 
stream. The five types of visual maps are generated and refined clustering character- 
istics are organized into four groups on changes of segmented and shifted lengths for 
visual comparisons on enlarged 2DP maps. Speckle patterns of various distributions 
are observed. Three variant maps with distinct statistic distributions could be useful 
to provide new visual tools to explore comprehensive cryptographic sequences on 
complex nonlinear dynamic behavior in global network environments. 
1 Introduction


Along with network communication and internet technology [1] in global
applications, web communication, the internet of things, cloud computing, big data, mobile
phones, and smart wireless technologies [2] have developed significantly in the last
decade and have been widely adopted across the world market. In the current situation, it is a key
issue for cryptographic researchers and applications [3] to use advanced stream cipher
technologies to protect the data security of ultrafast and extra-big data streams in
global network environments.


1.1 Pseudo-Random Sequences 


1.1.1 From Linear Stream Ciphers 


Traditional stream ciphers [4] based on the Linear Feedback Shift Register (LFSR) structure (in
military cryptography) are used as pseudo-random number generators, owing to their ease
of implementation in simple hardware, long periods, and uniformly distributed
streams. LFSR stream ciphers are the core of classical stream ciphers, supported by
the mathematical theory of algebraic functions for system simulation and analysis.

However, an LFSR is a linear system, which leads to fairly easy cryptanalysis using the
Berlekamp-Massey algorithm. Important LFSR-based stream ciphers include A5/1 and
A5/2 in GSM cell phones and E0 in Bluetooth. However, the A5/2 cipher has been broken
and both A5/1 and E0 have serious weaknesses [5, 6].
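To make the linearity concrete, here is a minimal sketch of a 4-bit Fibonacci LFSR. The register size, tap positions and seed are illustrative choices, not taken from any cipher named above. Every output bit satisfies the fixed recurrence $b_{t+4} = b_t \oplus b_{t+1}$, which is the kind of linear relation that the Berlekamp-Massey algorithm recovers from a short stretch of output.

```python
def lfsr_bits(state, taps, nbits, count):
    """Generate `count` output bits from a Fibonacci LFSR.

    state: initial register value (non-zero), taps: bit positions XORed
    together to form the feedback bit, nbits: register width.
    """
    out = []
    for _ in range(count):
        out.append(state & 1)                 # output the lowest bit
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1      # XOR of the tapped bits
        state = (state >> 1) | (feedback << (nbits - 1))
    return out


bits = lfsr_bits(state=0b0001, taps=(0, 1), nbits=4, count=30)
print(bits[:15])
print(bits[15:])   # the sequence repeats with period 2^4 - 1 = 15
```

With these taps the register cycles through all 15 non-zero states, so the period is maximal for 4 bits, yet four consecutive output bits are enough to predict everything that follows.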


1.1.2 From Nonlinear Stream Ciphers 


The new generation of stream ciphers [7, 8] are widely used in advanced web com- 
munications. Three general methods are applied to improve security weaknesses in 
LFSR-based stream ciphers: 


1. Nonlinear Functions: Nonlinear combination of several bits from the LFSR 
state [9]. 

2. Nonlinear Parts: Nonlinear combination of the output bits of two or more LFSRs 
or using Evolutionary algorithm for nonlinearity [10]. 

3. Clock Control: Irregular clocking of the LFSR, as in the alternating step gen- 
erator [11]. 


In quick succession, a series of nonlinear algorithms has emerged [12]: nonlinear
equivalence [13], evolutionary methods [10], the AES cipher [14], RC4 [15], ZUC [9], cellular
automata [16], and nonlinear dynamic systems [17].

The new generation of stream ciphers is shifting from the traditional mode,
LFSR [4], to various nonlinear modes: NLFSR [18, 19], clock control [11], nonlinear
functions [9], etc. It is essential for ciphers to be integrated and implemented [20] to
satisfy security models. However, unlike LFSRs, which have well-established linear
mathematical theories and simulation tools, it is extremely difficult to apply advanced
nonlinear mathematical theories, recursive models, descriptive tools, and
implementation schemes [17] in nonlinear dynamic environments.

How to evaluate cryptographic sequences generated by nonlinear stream
ciphers is an urgent problem for modern stream ciphers.


1.2 Truly Random Sequences from Hardware Devices 
and Speckle Patterns 


In addition to pseudo-random sequences generated by stream ciphers, truly random
sequences are generated by high-quality stochastic oscillators in special hardware
devices based on laser photonics [21], nonlinear optics [22], quantum optics [23],
quantum noise [24], thermal noise [25], chaos, and fractal nonlinear dynamics [26].

A number of truly random number generators have been developed to extract stochastic
information from speckle patterns [27], e.g., random bits from turbulence [28], where
random numbers are obtained from the speckle positions, generation of random arrays
using laser speckle [29], 2D generation of random numbers by multimode fiber speckle
[30], Markov speckle for efficient random bit generation [31], and dynamic laser speckle
and its applications [11].

Since various truly random sequences are created from specific physical models
with special principles and uncertain methodologies, it is extremely difficult for
cryptographic researchers to make proper measurements and explore their nonlinear
dynamic properties.


1.3 Statistic Testing Packages on Cryptographic Sequences 


Randomness has been explored for many years [32] through a series of statistical testing
theories and methods. The NIST 800-22 testing package [33] is an effective statistical
package for random sequences; it collects a set of 16 statistical testing schemes
for evaluating the statistical properties of cryptographic sequences. Statistical testing
packages are very useful for obtaining quantitative measurements that evaluate the
randomness properties of cryptographic sequences in wider applications. However,
the testing schemes in these packages focus mainly on the P-value or on a list of static
properties of the tested sequence.

Since handling comprehensive behaviors in nonlinear dynamics may dramatically increase
computational complexity by involving complicated dynamic properties in a
multivariate environment, those dynamic behaviors are completely ignored.


1.4 Gaussian Distribution and Speckle Pattern


Multivariate normal probability distribution models are the most important and
powerful tools used to test the stochastic characteristics of a random data sequence
[34] under the framework of probability, stochastic processes, and statistics [35] for
nonlinear problems. In this kind of measuring model, when the data sequence is
sufficiently long, the high-dimensional probability distribution of the sequence [36]
is similar to the continuous Gaussian distribution.

A typical projection model is shown in Fig. 1a; the central part shows a Gaussian
surface with an unbalanced distribution in a 2D plane, P(X, Y), rendered in
pseudo-colors, and its two 1D projections are shown in the horizontal P(X) and
vertical P(Y) planes, respectively. In Fig. 1b, a standard Gaussian surface
with a symmetric shape is illustrated, and the 2D projection of its pseudo-color map
is shown in Fig. 1c with an ideal continuous distribution of color on the map. In
contrast to such ideally continuous distributions, Fig. 1d shows a real image generated
by the laser speckle phenomenon [37]: an objective speckle pattern [38]
scattered by a laser beam from a plastic surface onto a wall. The two color maps in
Fig. 1c, d can be compared directly.

From this set of figures, the relationship between the projection curves and the two
1D Gaussian distributions can be observed in the multivariate normal probability
environment. Multivariate Gaussian probability distributions may support classical
schemes for analyzing complex stochastic data sets of measuring sequences in many
applications under continuous conditions. However, the speckle pattern in Fig. 1d
provides intrinsically discrete random patterns that may not be easily simulated by the
smoothed Gaussian map in Fig. 1c; further exploration of proper simulation and control
mechanisms is required.


1.5 Controlling Deterministic Chaos 


Controlling deterministic chaos has been an active R&D field in nonlinear dynamics
over the past decades. Since the pioneering work, significant progress has been
achieved in controlling spatiotemporal chaos [39] in plasma devices, laser systems [40],
chemical reactions, and biological systems, with both spatial and temporal dependence
considered. The complex Ginzburg-Landau equation (CGLE) system [41] describes
universal dynamic features near a supercritical Hopf bifurcation. It exhibits
defect-mediated turbulence or spiral turbulence in a wide parameter region. Control by
generating a spiral wave seed that grows into a stable spiral in the CGLE system
has been described [42, 43].

Systematic approaches to the simulation of nonlinear behaviors, speckle phenomena
in optics [37], and pattern dynamics [44] have been actively explored.


Variant Map System of Random Sequences 109 




Fig. 1 Multivariate Gaussian probability distributions and an objective speckle pattern; a Bivariate
normal distribution with two probability projections; b A symmetric bivariate normal surface with
pseudo-colors; c A 2D pseudo-color map of the symmetric bivariate normal surface; d An objective
speckle pattern scattered by a laser beam from a plastic surface onto a wall [38]


1.6 Poincaré Map 


From a measuring viewpoint, the spatial variations of a stochastic sequence are
captured by overall macro characteristics, shown as statistical measurements of
distributed patterns [45] in a vector space, so that a random sequence is measured in an
analytic space. From an analysis viewpoint, the Poincaré section [46] corresponds to
a discrete map proposed by the eminent French scientist Henri Poincaré about 100 years
ago.

The Poincaré map handles additional information from sequential changes of
ordered measurements in the phase space of classical dynamics, nonlinear dynamic
systems [47], and chaos.

The mapping mechanism of the Poincaré map may be useful for handling dynamic
patterns in cryptographic sequences of stream ciphers. This mapping scheme was
applied 20 years ago to observe the global randomness of cellular automata sequences
on 2D maps [48].


1.7 Variant Framework 


Various schemes following a top-down strategy have been explored for many years in
combinatorial algorithms [49] and in image analysis and processing. They use multiple
measures to partition special phase spaces from a top state set into multiple bottom
states via the levels of a hierarchy.

The conjugate classification [50] was proposed to apply seven measures in a
hierarchy to partition the kernels of four regular plane lattices in the n = {4, 5, 7, 9}
cases for 2D binary images. For 1D cellular automata sequences, global random behaviors
[48] are visualized in 2D maps.

For n-tuple bit vectors, the variant logic framework [51] was proposed and various
applications were explored: a 3D visual method for random number sequences
[52], a variant pseudo-random number generator (PRNG) [53, 54], computational
simulation of quantum interactions [55, 56], noncoding DNA analysis [57], and bat
echolocation [58].


1.8 Proposed Scheme 


For the purpose of system characterization based on comprehensive measurements
of cryptographic sequences, we propose a variant map system for a 0-1 stochastic
sequence of length N. The sequence is divided into M segments of a given length m.
A 2-tuple pair of measures is extracted from each 0-1 segment: the number of 1
elements and the number of 01 patterns in the segment. All paired measures compose
a sequence of M pairs of measures, i.e., an ordered measuring set with M elements.

The measuring sequence of pairs is directly separated into two independent
measuring sequences, keeping each parameter in its original order. Applying the pairing
scheme of the Poincaré section, each single measuring sequence is reorganized by
pairing two consecutive measures as a 2-tuple. The two measuring sequences in
Poincaré-section form and the original sequence of pairs make up three sequences of
2-tuple measures. In total, five sequences of distinct measures are constructed: two
sequences of single measures and three sequences of 2-tuple measures.

Following this approach, the two single measuring sequences are sorted into
two 1D numerical arrays, i.e., statistical histograms serving as classic 1D maps, and the
three 2-tuple measuring sequences are sorted into three 2D integer arrays, i.e., statistical
histograms serving as three variant maps. By controlling the segment length and the
shift displacement, the results of the five measuring sequences are transformed into
1D statistical histograms and 2D pseudo-color maps that show speckle patterns of the
selected cryptographic sequence under various combinations of the two control
parameters.


1.9 Organization of the Chapter 


This chapter describes the variant map system in diagrams of the system architecture 
and the core modules with input/output and processing functions in Sect. 2. In Sect. 
3, the relationships among measuring sequences and the five statistical distribution 
maps are analyzed. In Sect. 4, an AES cipher sequence is selected to form a series 
of statistical maps based on changes of the two control parameters. From the results 
of the visual maps in Sect. 4, intuitive analysis and brief comparisons are carried out 
in Sect. 5. Finally, in Sect. 6, the main results are summarized. 


2 Framework of Variant Map System 
2.1 Framework 


For the variant map system, the block diagrams of the system framework and the core
modules of the system are shown in Fig. 2. The framework of the system architecture
in Fig. 2a is composed of three core modules: the Shift Segment Measurement (SSM),
the Measuring Sequence Combination (MSC), and the Projective Color Map (PCM).
The three modules are shown in more detail in Fig. 2b-d, respectively.


2.2 Shift Segment Measurement SSM 


The SSM module is shown in Fig. 2b. 
Let X be a 0-1 vector with N elements as an input sequence, 


\[ X = X[0]X[1] \cdots X[I] \cdots X[N-1], \quad 0 \le I < N; \quad X[I] \in \{0, 1\} \tag{1} \]


Fig. 2 The framework of the variant map system for cryptographic sequences; a The system
architecture; b The SSM module; c The MSC module; d The PCM module


The SSM module consists of two processing units: the Vector Shift (VS) unit and the
Segment Measurement (SM) unit. The two input control parameters {r, m}
are the shift length r and the segment length m.

Let Y be a 0-1 vector with N elements, generated from the input sequence by a shift
operation under the loop displacement condition (i.e., a cyclic shift, right + or
left -):


\[ Y = X(r), \quad Y[I] = X[I + r], \quad I + r \pmod N, \quad 0 \le I < N; \quad X[I], Y[I] \in \{0, 1\} \tag{2} \]
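
The shift of Eq. (2) can be sketched as follows. This is a minimal illustration that
follows the index rule Y[I] = X[(I + r) mod N] literally; the function name `cyclic_shift`
is an assumption, and the code does not settle which sign the authors call "right" or
"left".

```python
def cyclic_shift(x, r):
    """Return Y with Y[I] = X[(I + r) mod N]; r may be negative or exceed N."""
    n = len(x)
    # Python's % always returns a value in [0, n), also for negative I + r.
    return [x[(i + r) % n] for i in range(n)]


x = [0, 0, 1, 1, 0]
print(cyclic_shift(x, 1))   # prints [0, 1, 1, 0, 0]
print(cyclic_shift(x, -1))  # prints [0, 0, 0, 1, 1]
```

Because the shift is cyclic, no bits are lost; only the segment boundaries used by the
following SM unit move relative to the data.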


The shifted vector is input to the SM unit for a segmentation process.
The long sequence with N elements is divided into $M = \lfloor N/m \rfloor$ segments,
i.e., a set of sub-vectors that each contain m bits. The element at the j-th position,
$0 \le j < m$, of the i-th sub-vector, $0 \le i < M$, is denoted as $Y_{i,j}$.

After the segmenting operation, this sequence of sub-vectors forms the following
$M \times m$ matrix (M rows of m positions). The m positions of the i-th complete row
vector correspond to a pair of 2-tuple measures $(p_i, q_i)$, and the incomplete part
after the last complete sub-vector is ignored.


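\[
Y =
\begin{pmatrix}
Y_{0,0} & Y_{0,1} & \cdots & Y_{0,j} & \cdots & Y_{0,m-1} \\
\vdots & \vdots & & \vdots & & \vdots \\
Y_{i,0} & Y_{i,1} & \cdots & Y_{i,j} & \cdots & Y_{i,m-1} \\
\vdots & \vdots & & \vdots & & \vdots \\
Y_{M-1,0} & Y_{M-1,1} & \cdots & Y_{M-1,j} & \cdots & Y_{M-1,m-1}
\end{pmatrix}
\rightarrow
\begin{pmatrix}
(p_0, q_0) \\ \vdots \\ (p_i, q_i) \\ \vdots \\ (p_{M-1}, q_{M-1})
\end{pmatrix}
= \{(p_i, q_i)\}_{i=0}^{M-1} \tag{3}
\]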


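The pair of 2-tuple measures $(p_i, q_i)$ is determined by the following formulas:


\[ Y_{i,j} = Y[J] \in \{0, 1\}; \quad J = i \times m + j, \quad 0 \le i < M, \; 0 \le j < m, \; 0 \le J < m \times M \le N \tag{4} \]

\[ p_i = \sum_{j=0}^{m-1} Y_{i,j}, \quad Y_{i,j} \in \{0, 1\}, \quad 0 \le p_i \le m; \tag{5} \]

\[ q_i = \sum_{j=0}^{m-1} \big[ (Y_{i,j-1}, Y_{i,j}) == (0, 1) \big], \quad j - 1 \pmod m, \quad 0 \le q_i \le \lfloor m/2 \rfloor; \tag{6} \]


e.g., X = 0011010010, N = 10, M = 2, m = 5; $(p_0 = 2, q_0 = 1)$; $(p_1 = 2, q_1 = 2)$.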

The parameter $p_i$ is the number of 1 elements in the i-th sub-vector, and the
parameter $q_i$ is the number of 01 patterns in the i-th sub-vector, counted under a
cyclic condition. For any segment length m > 0, $0 \le p_i \le m$ and
$0 \le q_i \le \lfloor m/2 \rfloor$; all segments are transformed from a random sequence
with N elements into a measuring sequence with M elements.


The SSM module outputs the ordered pairs of 2-tuple measures $\{(p_i, q_i)\}_{i=0}^{M-1}$.
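
The segment measurement can be sketched in a few lines of Python. This is a minimal
illustration under the definitions of Eqs. (4)-(6), not the authors' implementation; the
function name `segment_measures` is hypothetical. It reproduces the worked example
X = 0011010010 with m = 5.

```python
def segment_measures(y, m):
    """Split y into floor(N/m) complete segments and return [(p_i, q_i), ...]."""
    pairs = []
    for i in range(len(y) // m):      # an incomplete tail segment is ignored
        seg = y[i * m:(i + 1) * m]
        p = sum(seg)                  # Eq. (5): number of 1 elements
        # Eq. (6): cyclic count of 01 patterns; seg[-1] precedes seg[0]
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        pairs.append((p, q))
    return pairs


x = [int(ch) for ch in "0011010010"]
print(segment_measures(x, 5))  # prints [(2, 1), (2, 2)]
```

In the second segment, 10010, the pattern that wraps from the last bit (0) to the first
bit (1) is counted, which is why $q_1 = 2$ rather than 1.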


2.3 Measuring Sequence Combination MSC 


The MSC module is described in Fig. 2c. The module is composed of two units: the
Measuring Split (MS) unit and the Measuring Combination (MC) unit. The MS unit
processes the SSM module's output and splits the measuring sequence of 2-tuple measures
into two independent measuring sequences, $\{p_i\}_{i=0}^{M-1}$ and $\{q_i\}_{i=0}^{M-1}$,
keeping the original number of measures invariant.

By recombining each single measuring sequence, pairing overlapping consecutive
elements, the MC unit forms two independent measuring sequences organized in
2-tuple measures: $\{p_i\}_{i=0}^{M-1} \rightarrow \{(p_{i-1}, p_i)\}_{i=0}^{M-1}$ and
$\{q_i\}_{i=0}^{M-1} \rightarrow \{(q_{i-1}, q_i)\}_{i=0}^{M-1}$, $i - 1 \pmod M$, to provide
appropriate sequences for subsequent processing modules.

The MSC module produces the following four measuring sequences:
$\{p_i\}_{i=0}^{M-1}$, $\{q_i\}_{i=0}^{M-1}$, $\{(p_{i-1}, p_i)\}_{i=0}^{M-1}$, and
$\{(q_{i-1}, q_i)\}_{i=0}^{M-1}$, respectively.
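
The split and the cyclic recombination can be sketched as below. The function name
`split_and_pair` and the three input pairs are illustrative assumptions; the pairing
follows the rule $(p_{i-1}, p_i)$ with $i - 1 \pmod M$ stated above.

```python
def split_and_pair(pairs):
    """Split [(p_i, q_i)] into {p_i}, {q_i} and their cyclic 2-tuple sequences."""
    p = [a for a, _ in pairs]
    q = [b for _, b in pairs]
    # Index i - 1 wraps to the last element when i == 0 (Python negative index).
    pp = [(p[i - 1], p[i]) for i in range(len(p))]
    qq = [(q[i - 1], q[i]) for i in range(len(q))]
    return p, q, pp, qq


p, q, pp, qq = split_and_pair([(2, 1), (2, 2), (3, 1)])
print(pp)  # prints [(3, 2), (2, 2), (2, 3)]
print(qq)  # prints [(1, 1), (1, 2), (2, 1)]
```

All four outputs have the same length M as the input, so the number of measures stays
invariant through the module.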


2.4 Projective Color Map PCM 


The PCM module consists of two units: PA and CM. The five measuring sequences are
processed separately as 1D and 2D measures.

The PA unit transforms the relevant measuring sequences into integer arrays, and
the CM unit visualizes these as either normalized histograms (1D measures) or
color maps (2D measures), respectively.


2.4.1 1D Measures 


The 1D measures involve two measuring sequences: $\{p_i\}_{i=0}^{M-1}$ and
$\{q_i\}_{i=0}^{M-1}$. Let $P[m + 1]$, $Q[\lfloor m/2 \rfloor + 1]$ and $NP[m + 1]$,
$NQ[\lfloor m/2 \rfloor + 1]$ be two pairs of 1D (integer, float) arrays to represent the
corresponding elements, which are defined in the following.


2.4.2 1DP Map 


The 1DP statistic histogram: for a sequence $\{p_i\}_{i=0}^{M-1}$, NP and P are two arrays
(float, integer) with (m + 1) elements. The j-th elements NP[j], P[j], $0 \le j \le m$,
can be obtained from the following procedure:


Initialization: NP[j] = 0.0, P[j] = 0 for all 0 <= j <= m;
Calculation: for (i = 0; i < M; i++) { P[p_i]++; }
Normalization: for (j = 0; j <= m; j++) { NP[j] = P[j] / M; }


In the 1DP map, the PA unit corresponds to Initialization and Calculation; the CM
unit handles Normalization.


2.4.3 1DQ Map


The 1DQ statistic histogram: for a sequence $\{q_i\}_{i=0}^{M-1}$, NQ and Q are two arrays
(float, integer) with $(\lfloor m/2 \rfloor + 1)$ elements. The j-th elements NQ[j], Q[j],
$0 \le j \le \lfloor m/2 \rfloor$, can be obtained from the following procedure:


Initialization: NQ[j] = 0.0, Q[j] = 0 for all 0 <= j <= floor(m/2);
Calculation: for (i = 0; i < M; i++) { Q[q_i]++; }
Normalization: for (j = 0; j <= floor(m/2); j++) { NQ[j] = Q[j] / M; }


Using the P, NP, Q, and NQ arrays, it is possible to generate the corresponding 1D
statistical histograms as 1D maps.

In the 1DQ map, the PA unit corresponds to Initialization and Calculation; the
CM unit handles Normalization.
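
The three steps shared by the 1DP and 1DQ procedures can be sketched with one helper.
The name `histogram_1d` and the short sample sequences are assumptions made for
illustration; a non-empty input is assumed so that the division by M is defined.

```python
def histogram_1d(values, size):
    """Return (counts, normalized) for integer measures in range(size)."""
    counts = [0] * size                  # Initialization
    for v in values:                     # Calculation
        counts[v] += 1
    total = len(values)                  # M, the number of segments
    normalized = [c / total for c in counts]  # Normalization
    return counts, normalized


m = 5
p = [2, 2, 3]   # hypothetical {p_i}
q = [1, 2, 1]   # hypothetical {q_i}
P, NP = histogram_1d(p, m + 1)
Q, NQ = histogram_1d(q, m // 2 + 1)
print(P, Q)  # prints [0, 0, 2, 1, 0, 0] [0, 2, 1]
```

The array sizes m + 1 and floor(m/2) + 1 cover every value that $p_i$ and $q_i$ can take,
so no bounds check is needed for measures produced by the SSM module.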


2.4.4 2D Measures 


The 2D measures process three measuring sequences: $\{(p_{i-1}, p_i)\}_{i=0}^{M-1}$,
$\{(q_{i-1}, q_i)\}_{i=0}^{M-1}$, and $\{(p_i, q_i)\}_{i=0}^{M-1}$. Let $P[m + 1 : m + 1]$,
$Q[\lfloor m/2 \rfloor + 1 : \lfloor m/2 \rfloor + 1]$, and
$PQ[m + 1 : \lfloor m/2 \rfloor + 1]$ be three 2D integer arrays to represent the
corresponding elements, which are defined in the following.


2.4.5 2DP Map 


The 2DP statistic histogram: for a sequence $\{(p_{i-1}, p_i)\}_{i=0}^{M-1}$, P is a 2D
integer array with $(m + 1)^2$ elements. The (i, j)-th element P[i, j], $0 \le i, j \le m$,
can be obtained from the following procedure:


Initialization: P[i, j] = 0 for all 0 <= i, j <= m;
Calculation: P[p_{M-1}, p_0]++;
for (i = 1; i < M; i++) { P[p_{i-1}, p_i]++; }
Pseudo-color: Matching a proper color to each P[i, j], 0 <= i, j <= m


In the 2DP map, the PA unit corresponds to Initialization and Calculation; the CM
unit handles Pseudo-color.


2.4.6 2DQ Map


The 2DQ statistic histogram: for a sequence $\{(q_{i-1}, q_i)\}_{i=0}^{M-1}$, Q is a 2D
integer array with $(\lfloor m/2 \rfloor + 1)^2$ elements. The (i, j)-th element Q[i, j],
$0 \le i, j \le \lfloor m/2 \rfloor$, can be obtained from the following procedure:


Initialization: Q[i, j] = 0 for all 0 <= i, j <= floor(m/2);
Calculation: Q[q_{M-1}, q_0]++;
for (i = 1; i < M; i++) { Q[q_{i-1}, q_i]++; }
Pseudo-color: Matching a proper color to each Q[i, j], 0 <= i, j <= floor(m/2)


In the 2DQ map, the PA unit corresponds to Initialization and Calculation; the
CM unit handles Pseudo-color.


2.4.7 2DPQ Map 


The 2DPQ statistic histogram: for a sequence $\{(p_i, q_i)\}_{i=0}^{M-1}$, PQ is a 2D
integer array with $(m + 1) \times (\lfloor m/2 \rfloor + 1)$ elements. The (i, j)-th
element PQ[i, j], $0 \le i \le m$, $0 \le j \le \lfloor m/2 \rfloor$, can be obtained from
the following procedure:


Initialization: PQ[i, j] = 0 for all 0 <= i <= m, 0 <= j <= floor(m/2);
Calculation: for (i = 0; i < M; i++) { PQ[p_i, q_i]++; }
Pseudo-color: Matching a proper color to each PQ[i, j], 0 <= i <= m, 0 <= j <= floor(m/2)


In the 2DPQ map, the PA unit corresponds to Initialization and Calculation; the
CM unit handles Pseudo-color.

Through the PCM module, five measuring sequences are transformed into two
1D arrays and three 2D arrays with $(m + 1)$, $(\lfloor m/2 \rfloor + 1)$, $(m + 1)^2$,
$(\lfloor m/2 \rfloor + 1)^2$, and $(m + 1) \times (\lfloor m/2 \rfloor + 1)$ clusters,
respectively.

The final results of the variant map system are five maps: 1DP, 1DQ, 2DP, 2DQ, 
and 2DPQ as expected statistic distributions of the input 0-1 sequence. 
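
The Initialization and Calculation steps of the three 2D maps can be sketched together.
The function names `histogram_2d` and `variant_2d_maps` are hypothetical, and the
pseudo-color step is omitted because it depends on the plotting tool; this is an
illustration of the counting rule only.

```python
def histogram_2d(pairs, rows, cols):
    """Count 2-tuple measures into a rows x cols integer grid."""
    grid = [[0] * cols for _ in range(rows)]   # Initialization
    for a, b in pairs:                         # Calculation
        grid[a][b] += 1
    return grid


def variant_2d_maps(pq, m):
    """Return the 2DP, 2DQ, and 2DPQ count arrays for measures [(p_i, q_i)]."""
    half = m // 2 + 1
    p = [a for a, _ in pq]
    q = [b for _, b in pq]
    n = len(pq)
    # Index i - 1 wraps around at i == 0, matching P[p_{M-1}, p_0]++.
    two_dp = histogram_2d([(p[i - 1], p[i]) for i in range(n)], m + 1, m + 1)
    two_dq = histogram_2d([(q[i - 1], q[i]) for i in range(n)], half, half)
    two_dpq = histogram_2d(pq, m + 1, half)
    return two_dp, two_dq, two_dpq


two_dp, two_dq, two_dpq = variant_2d_maps([(2, 1), (2, 2)], 5)
print(two_dp[2][2], two_dq[2][1], two_dq[1][2], two_dpq[2][1])  # prints 2 1 1 1
```

Each of the three grids receives exactly one increment per segment, so every grid sums
to M.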


3 Sequence Analysis 


3.1 Ideal Condition 


From the viewpoint of sequence analysis, sorting the $\{p_i\}_{i=0}^{M-1}$ measuring
sequence into a 1D statistic histogram is a classical technique. When the measuring
sequence meets ideal conditions, the 1D statistical distribution is a binomial distribution.


Lemma 1 For an input 0-1 sequence, if the total number of segments is equal to
$M = 2^m$, and each segment of m bits appears only once in the sequence, then the
1DP array satisfies the binomial distribution:


\[ P[i] = \binom{m}{i}, \quad 0 \le i \le m \tag{7} \]


Corollary 1 If the input sequence meets the conditions of Lemma 1, then the total
number of items in the 1DP array is equal to


\[ \sum_{i=0}^{m} P[i] = 2^m = M \tag{8} \]


Lemma 2 If the input sequence meets the conditions of Lemma 1, then the 1DQ
array satisfies the following relation:


\[ Q[i] = 2\binom{m}{2i}, \quad 0 \le i \le \lfloor m/2 \rfloor \tag{9} \]


Corollary 2 If the input sequence meets the conditions of Lemma 1, then the total
number of items in the 1DQ array is equal to


\[ \sum_{i=0}^{\lfloor m/2 \rfloor} Q[i] = 2^m = M \tag{10} \]
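
Lemmas 1 and 2 can be checked by brute force for small m. The sketch below enumerates
every m-bit segment exactly once, which is the ideal condition of Lemma 1, and compares
the resulting counts with Eqs. (7) and (9). The function name `ideal_counts` and the
choice m = 6 are assumptions; `math.comb` requires Python 3.8 or later.

```python
from itertools import product
from math import comb


def ideal_counts(m):
    """Return the 1DP and 1DQ count arrays when each m-bit segment occurs once."""
    p_counts = [0] * (m + 1)
    q_counts = [0] * (m // 2 + 1)
    for seg in product((0, 1), repeat=m):
        p_counts[sum(seg)] += 1
        # cyclic count of 01 patterns, as in Eq. (6)
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        q_counts[q] += 1
    return p_counts, q_counts


m = 6
p_counts, q_counts = ideal_counts(m)
print(p_counts)  # prints [1, 6, 15, 20, 15, 6, 1]
print(q_counts)  # prints [2, 30, 30, 2]
assert p_counts == [comb(m, i) for i in range(m + 1)]
assert q_counts == [2 * comb(m, 2 * i) for i in range(m // 2 + 1)]
```

The factor 2 in Eq. (9) has a simple reading: a cyclic segment with i patterns of 01 has
2i value changes among its m cyclic positions, and once those positions are chosen the
first bit can still be 0 or 1.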


3.2 General Condition 


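Theorem 1 For any 0-1 sequence with N elements, a 2DP array has two projections,
in the vertical and horizontal directions, and both correspond to the 1DP
array.


Proof A 2DP array is generated from a measuring sequence $\{(p_{i-1}, p_i)\}_{i=0}^{M-1}$,
and the 2DP array is $\{P[i, j]\}_{i=0, j=0}^{m, m}$. From both directions,
$P[i] = \sum_{j=0}^{m} P[i, j]$, $0 \le i \le m$; $P[j] = \sum_{i=0}^{m} P[i, j]$,
$0 \le j \le m$. Because the pairing is cyclic, each measure occurs exactly once as a
first component and exactly once as a second component, so
$\{P[i]\}_{i=0}^{m} = \{P[j]\}_{j=0}^{m}$. Both projections are the same 1DP array.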


Corollary 3 For an arbitrary input sequence, the total number of items in the 2DP
array is equal to


\[ \sum_{i=0}^{m} \sum_{j=0}^{m} P[i, j] = \sum_{i=0}^{m} P[i] = M \tag{11} \]


Theorem 2 For any 0-1 sequence with N elements, a 2DQ projection in both
directions is the 1DQ array.


Proof A 2DQ array is generated from a measuring sequence $\{(q_{i-1}, q_i)\}_{i=0}^{M-1}$,
and the 2DQ array is $\{Q[i, j]\}_{i=0, j=0}^{\lfloor m/2 \rfloor, \lfloor m/2 \rfloor}$.
From both directions, $Q[i] = \sum_{j=0}^{\lfloor m/2 \rfloor} Q[i, j]$,
$0 \le i \le \lfloor m/2 \rfloor$; $Q[j] = \sum_{i=0}^{\lfloor m/2 \rfloor} Q[i, j]$,
$0 \le j \le \lfloor m/2 \rfloor$; so
$\{Q[i]\}_{i=0}^{\lfloor m/2 \rfloor} = \{Q[j]\}_{j=0}^{\lfloor m/2 \rfloor}$.
Both projections are the same 1DQ array.


Corollary 4 For an arbitrary input sequence, the total number of items in the 2DQ
array is equal to


\[ \sum_{i=0}^{\lfloor m/2 \rfloor} \sum_{j=0}^{\lfloor m/2 \rfloor} Q[i, j] = \sum_{i=0}^{\lfloor m/2 \rfloor} Q[i] = M \tag{12} \]


Theorem 3 For any 0-1 sequence with N elements, the projections of a 2DPQ array in
the two directions correspond to a 1DP array and a 1DQ array, respectively.


Proof A 2DPQ array is generated from a measuring sequence $\{(p_i, q_i)\}_{i=0}^{M-1}$,
and the 2DPQ array is $\{PQ[i, j]\}_{i=0, j=0}^{m, \lfloor m/2 \rfloor}$. From the two
directions, $P[i] = \sum_{j=0}^{\lfloor m/2 \rfloor} PQ[i, j]$, $0 \le i \le m$;
$Q[j] = \sum_{i=0}^{m} PQ[i, j]$, $0 \le j \le \lfloor m/2 \rfloor$. So the two
projections correspond to a 1DP array and a 1DQ array, respectively.


Corollary 5 For an arbitrary 0-1 input sequence, the total number of items in the
2DPQ array is equal to


\[ \sum_{i=0}^{m} \sum_{j=0}^{\lfloor m/2 \rfloor} PQ[i, j] = M = \sum_{i=0}^{m} P[i] = \sum_{j=0}^{\lfloor m/2 \rfloor} Q[j] \tag{13} \]
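
The projection statements of Theorems 1-3 and the totals of Corollaries 3-5 hold for any
input, so they can be checked numerically on an arbitrary sequence. In the sketch below
the bit source is Python's `random.Random` with an arbitrary seed, used only as a
stand-in for a cryptographic sequence; all function names are hypothetical.

```python
import random


def measures(bits, m):
    """Return (p_i, q_i) for each complete m-bit segment, as in Eqs. (5)-(6)."""
    out = []
    for i in range(len(bits) // m):
        seg = bits[i * m:(i + 1) * m]
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        out.append((sum(seg), q))
    return out


def grid(pairs, rows, cols):
    g = [[0] * cols for _ in range(rows)]
    for a, b in pairs:
        g[a][b] += 1
    return g


def row_sums(g):
    return [sum(row) for row in g]


def col_sums(g):
    return [sum(col) for col in zip(*g)]


m = 8
half = m // 2 + 1
rng = random.Random(2018)  # arbitrary seed; a stand-in source, not a cipher
bits = [rng.getrandbits(1) for _ in range(8000)]
pq = measures(bits, m)
M = len(pq)
p = [a for a, _ in pq]
q = [b for _, b in pq]

one_dp = [p.count(v) for v in range(m + 1)]
one_dq = [q.count(v) for v in range(half)]
two_dp = grid([(p[i - 1], p[i]) for i in range(M)], m + 1, m + 1)
two_dq = grid([(q[i - 1], q[i]) for i in range(M)], half, half)
two_dpq = grid(pq, m + 1, half)

assert row_sums(two_dp) == col_sums(two_dp) == one_dp   # Theorem 1
assert row_sums(two_dq) == col_sums(two_dq) == one_dq   # Theorem 2
assert row_sums(two_dpq) == one_dp                      # Theorem 3, P direction
assert col_sums(two_dpq) == one_dq                      # Theorem 3, Q direction
assert sum(one_dp) == sum(one_dq) == M                  # Corollaries 3-5
```

The equalities for the 2DP and 2DQ arrays rely on the cyclic pairing; without the
wrap-around pair the two projections would differ by one count at the first and last
measures.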


Corollary 6 For an arbitrary input sequence, five measuring sequences correspond
to two 1D and three 2D arrays. Let |G| denote the number of associated possible
clusters in G. If m > 3, then |2DP| > |2DPQ| > |2DQ| > |1DP| > |1DQ|
is satisfied.


Proof The five arrays (2DP, 2DPQ, 2DQ, 1DP, 1DQ) contain $\{(m + 1)^2$,
$(m + 1) \times (\lfloor m/2 \rfloor + 1)$, $(\lfloor m/2 \rfloor + 1)^2$, $(m + 1)$,
$(\lfloor m/2 \rfloor + 1)\}$ items, respectively. If m > 3,
then the inequalities are true.
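
The cluster counts in the proof are easy to tabulate. The helper below (the name
`cluster_counts` is an assumption) evaluates the five sizes and shows why the condition
m > 3 is needed: at m = 3 the 2DQ and 1DP arrays have the same number of items.

```python
def cluster_counts(m):
    """Return the item counts of (2DP, 2DPQ, 2DQ, 1DP, 1DQ) for segment length m."""
    half = m // 2 + 1
    return ((m + 1) ** 2, (m + 1) * half, half ** 2, m + 1, half)


print(cluster_counts(3))    # prints (16, 8, 4, 4, 2)
print(cluster_counts(4))    # prints (25, 15, 9, 5, 3)
print(cluster_counts(128))  # prints (16641, 8385, 4225, 129, 65)
```

For m = 128, the 2DP array offers 16641 possible clusters against 129 for the 1DP
array, which is the quantitative basis for preferring 2DP maps when fine detail is
wanted.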


3.3 Brief Discussion 


From the statements listed in the lemmas, theorems, and corollaries, Lemmas 1 and 2
describe an ideal input sequence in which the segments are uniformly distributed,
each appearing only once. Under this ideal condition, both the 1DP and 1DQ arrays
are given by binomial coefficients. Corollaries 1 and 2 show that both the
1DP and 1DQ arrays sum to the total number of segments for the ideal
input sequence.

Theorems 1 and 2 establish projective conditions for any input sequence: a 2DP
or 2DQ array has the same 1D array as its projection in both directions. Theorem 3
states that, for any 2DPQ array, the two projections correspond to the 1DP and
1DQ arrays, respectively.

Corollaries 3 and 4 give the total sums of the 2DP and 2DQ arrays, respectively.
Corollary 5 is associated with Theorem 3: the 2DPQ array shares the same total with
the other four arrays. In Corollary 5, the total of each of the five statistic arrays
is equal to the total number of segments M, and the 2DPQ array occupies
a central position in the projections to the other arrays. Corollary 6 uses inequalities
to order the numbers of items in the five arrays and to identify the array with the
maximal number of items in the structure.

From the viewpoint of complex stochastic sequence analysis, this partition mode
corresponds to the maximum number of clusters that can be distinguished under the
condition of multiple segments. Unlike surface analysis based on the multivariate
Gaussian probability distribution, variant maps provide only a finite number of
lattice points, which form space-related clusters at the projection positions. For
larger segment lengths, the 2DP array has the maximum number of distinct items among
the five arrays and therefore produces the most visible map, showing the most refined
details of the distribution.


4 Sample Maps 


Since the ideal distribution appears only under specific conditions, it is very
difficult to use algebraic formulas to describe the measuring sequences on statistical
maps of an arbitrary cryptographic sequence. For complicated data sequences, the most
effective scheme is to use a computational approach to generate the relevant
maps directly and then to make feasible comparisons. Among the five maps generated
from an input 0-1 sequence, mostly 2DP maps are selected in this section to illustrate,
in refined detail, a series of changes in segment length and shift length.

In this section, one cryptographic sequence generated from an AES cipher is
selected as a sample sequence, and various control parameters are changed. This
sample sequence has a fixed length of $N = 10^6$, i.e., one million stochastic bits.
The segment length m and the shift displacement r are varied. Five maps
are applied to show their special statistical distributions.
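
The parameter sweep used in this section can be sketched as follows. The AES sample
sequence of the chapter is not available here, so the sketch substitutes a shorter
sequence from Python's `random.Random`; this stand-in, the seed, the length of 100000
bits, and the function name `two_dp_map` are all assumptions, and the resulting maps are
not the maps shown in the figures.

```python
import random


def two_dp_map(bits, m, r):
    """Return the 2DP count array for segment length m and shift length r."""
    n = len(bits)
    shifted = [bits[(i + r) % n] for i in range(n)]          # Eq. (2)
    p = [sum(shifted[i * m:(i + 1) * m]) for i in range(n // m)]
    grid = [[0] * (m + 1) for _ in range(m + 1)]
    for i in range(len(p)):
        grid[p[i - 1]][p[i]] += 1                            # cyclic pairing
    return grid


rng = random.Random(0)  # stand-in source, NOT an AES keystream
bits = [rng.getrandbits(1) for _ in range(100_000)]
for m, r in [(8, 0), (16, 0), (128, 0), (128, 8)]:
    grid = two_dp_map(bits, m, r)
    total = sum(map(sum, grid))                    # equals floor(N / m)
    occupied = sum(1 for row in grid for c in row if c > 0)
    print(f"m={m} r={r} segments={total} occupied clusters={occupied}")
```

Passing each returned grid to a plotting routine with a pseudo-color scale would give
the regular and enlarged 2DP maps; the count of occupied clusters gives a rough
indication of how the measures spread as m grows.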


4.1 Dramatically Changing the Segment Lengths: 1DP, 1DQ,
2DP, 2DQ, and 2DPQ Maps m = {8, 16, 128}, r = 0


Three groups of maps, in Figs. 3, 4, and 5, are compared based on the five maps.
In Fig. 3, nine maps from the 1DQ and 2DQ forms are selected under the
m = {8, 16, 128}, r = 0 condition; (a)-(c) show three 1DQ maps with different
segment lengths; (d)-(f) show 2DQ maps in normal sizes; and (g)-(i) are the same
2DQ maps in enlarged sizes.
In Fig. 4, 12 maps from the 1DP, 2DPQ, and 1DQ forms are selected under the
m = {8, 16, 128}, r = 0 condition; (a)-(c) show three 1DP maps with different
segment lengths; (d)-(f) show 2DPQ maps in normal sizes; (g)-(i) are the same
2DPQ maps in enlarged sizes; and (j)-(l) show 1DQ maps for convenient
comparison.


Fig. 3 1DQ and 2DQ maps on m = {8, 16, 128}, r = 0; a-c 1DQ maps; d-f 2DQ Regular maps;
g-i 2DQ Enlarged maps

In Fig. 5, nine maps from the 1DP and 2DP forms are selected under the
m = {8, 16, 128}, r = 0 condition; (a)-(c) show three 1DP maps with different
segment lengths; (d)-(f) show 2DP maps in normal sizes; and (g)-(i) are the same 2DP
maps in enlarged sizes.


4.2 Small Changes in Segment Lengths: 2DP Maps; 
Variation Series in Lengths of Segments 
m = {125, 126, 127}, r = 0


Two groups of maps are compared in Fig. 6 based on slightly changing segment
lengths.


Fig. 4 1DP, 2DPQ, and 1DQ maps on m = {8, 16, 128}, r = 0; a-c 1DP maps; d-f 2DPQ Regular
maps; g-i 2DPQ Enlarged maps; j-l 1DQ maps


In Fig. 6, nine maps from the 1DP and 2DP forms are selected under the
m = {125, 126, 127}, r = 0 condition; (a)-(c) show three 1DP maps with different
segment lengths; (d)-(f) are 2DP maps in normal sizes; and (g)-(i) show the
same 2DP maps in enlarged sizes.


Fig. 5 1DP and 2DP maps on m = {8, 16, 128}, r = 0; a-c 1DP maps; d-f 2DP Regular maps; 
g-i 2DP Enlarged maps 


4.3 Changing the Lengths of Shift Displacement: 2DP Maps
Change on Displacement Series m = 128, r = {1, 2, 8}


Two groups of maps are compared in Fig. 7 under changing shift lengths. 

In Fig. 7, nine maps from the 1DP and 2DP forms are selected under the m = 128,
r = {1, 2, 8} condition; (a)-(c) show three 1DP maps with different shift lengths;
(d)-(f) are 2DP maps in normal sizes; and (g)-(i) show the same 2DP maps
in enlarged sizes.


Fig. 6 1DP and 2DP maps on m = {125, 126, 127}, r = 0; a-c 1DP maps; d-f 2DP Regular maps;
g-i 2DP Enlarged maps


4.4 Enlarged Maps: 2DP Maps on m = {125, 127, 128}, 
r = {0,8} 


2DP maps are selected in both Figs. 8 and 9 in enlarged forms.

In Fig. 8, four maps from the 2DP form are selected under the m = {125, 127, 128},
r = {0, 8} condition: (a) r = 0, m = 125; (b) r = 0, m = 127; (c) r = 0, m = 128; and
(d) r = 8, m = 128. The four maps show the 2DP maps in enlarged sizes.

In Fig. 9a and b, two maps of speckle patterns are selected from two distinct
sources for comparison: (a) a larger map from the 2DP form is generated under the
m = 128, r = 0 condition; (b) a larger map of Fig. 1d is illustrated for a laser beam
reflected from a plastic surface onto a wall. It is convenient for readers to observe
the two speckle pattern maps in refined detail.


Fig. 7 1DP and 2DP maps on m = 128, r = {1, 2, 8}; a-c 1DP maps; d-f 2DP Regular maps; g-i
2DP Enlarged maps


5 Result Analysis 


5.1 Figures 3, 4, and 5


In Figs. 3, 4, and 5, six maps are listed in each of the 1DP (Figs. 4a-c and 5a-c) and
1DQ (Figs. 3a-c and 4j-l) forms; their distributions generally correspond to
binomial coefficients. As the segment length changes, the 1D maps show
binomial patterns as symmetric bell curves with the
maximal value in the middle area.

From Figs. 3 and 5, six 2DQ maps (Fig. 3d-i) and six 2DP maps (Fig. 5d-i) are
listed. When m = {8, 16}, significant regular distributions along both the horizontal
and vertical directions (Figs. 3d-h and 5d-h) appear as symmetric patterns. The central
cluster collects the largest number of measures and is located at the center point of
the relevant maps. However, in the maps of Figs. 3f, i and 5f, i, the regular patterns
with central symmetry are severely destroyed when the segment length is increased to
m = 128. Regarding the two maps in Figs. 3f and 5f, both show circular disks
with the highest number of collected measures at the central position. However, the
two enlarged maps in Figs. 3i and 5i clearly show significant speckle patterns
around the central areas, with stochastically higher numbers of measures.
Comparing the two maps in Figs. 3i and 5i, Fig. 5i shows much more visible
asymmetry than Fig. 3i.


Fig. 8 2DP larger maps on m = {125, 127, 128}, r = {0, 8}; a r = 0, m = 125 map; b r = 0,
m = 127 map; c r = 0, m = 128 map; d r = 8, m = 128 map

Because a 2DQ map covers only a quarter of a 2DP map, the damage to
its symmetric properties appears much weaker than in the 2DP map. With a
sufficiently large segment length, random speckle patterns are observed in the central
areas and the visible symmetric properties are significantly damaged.

In general, a 2DP map shows an approximately rotational symmetry in its middle
area for small segment lengths. But when the segment length is large
enough, significant speckle patterns emerge in the central area with stronger
stochastic properties.


Fig. 9 Speckle patterns in enlarged maps of the 2DP form; a m = 128, r = 0; b m = 128, r = 8

In the 2DPQ maps of Fig. 4d-i, when m = {8, 16}, a single central point appears
as a key cluster collecting the maximal number of measures, with visibly symmetric
patterns in the horizontal direction but no symmetric pattern in the vertical direction
(Fig. 4d-h). However, when m = 128, the 2DPQ map of Fig. 4f appears as an
irregular disk with higher values in the central area.

In the enlarged 2DPQ map of Fig. 4i, stochastic speckle patterns appear in the
central area, with better symmetry in the horizontal direction than in the vertical
direction and with significantly damaged details.


5.2 Figure 6


In Fig. 6a-i, nine maps are listed to show small changes in the segment length,
m = {125, 126, 127}. In the three 1DP maps in Fig. 6a-c, the three middle
areas appear slightly different from the bell shape: (a) left is higher than right; (b)
right is higher than left; (c) right is higher than left and the middle one is lower than
its nearest neighbors.

The three 2DP maps in (d)-(f) appear as circular disks with an
approximate symmetry and higher clusters around central areas. In the three enlarged
2DP maps in (g)-(i), various speckle patterns appear in central areas.

Comparing the six maps of (a)-(c) and (g)-(i), the speckle patterns in the three 2DP
maps (g)-(i) are much easier to identify than the broken curve patterns in the three
1DP maps (a)-(c).


5.3 Figure 7 


In Fig. 7a-i, nine maps are listed to analyze changes of the parameters m = 128, r = {1, 2, 8}. In the three 1DP maps in Fig. 7a-c, the middle areas of the three maps differ slightly from the regular bell shape: (a) left is lower than middle and middle is equal to right; (b) left and right are lower than middle, and right is higher than left; (c) left, middle, and right are equal.

The three 2DP maps in (d)-(f) appear as similar circular disks with an approximate symmetry and higher clusters around central areas. In the three enlarged 2DP maps (g)-(i), various distinguishable speckle patterns are placed in central areas.

Comparing the six maps of (a)-(c) and (g)-(i), distinguishable speckle patterns in the three 2DP maps (g)-(i) are much easier to identify than broken curving patterns in the three 1DP maps (a)-(c).


5.4 Figures 8-9 


In Fig. 8a-d, four enlarged 2DP maps are listed using the parameters m = {125, 127, 128}, r = {0, 8}. Three maps (a)-(c) are created with m = {125, 127, 128}, r = 0 and two maps (c)-(d) with m = 128, r = {0, 8}. The four larger 2DP maps in (a)-(d) show stronger speckle patterns distinguishable in their central areas, with significant distributions identified differently from mixed reflection and rotational effects.

In Fig. 9a-b, two enlarged maps of speckle patterns are selected. Map (a) with m = 128, r = 0 provides refined details to illustrate stochastic speckle patterns in the central area, and map (b) with m = 128, r = 8 has the same segment length but a different shift length. The highest color clusters of map (b) appear more compact and simpler than the highest color clusters of map (a). The two maps show different speckle patterns as a result of simple geometric transformations.

By comparing the two enlarged speckle pattern maps, significant similarities and differences in details can be recognized.


6 Conclusion 


For any 0-1 sequence with N elements, the variant map system processes multiple segments and transforms each segment into a pair of measures. Using a cryptographic sequence generated from the AES cipher, five statistical maps were created. Two 1D maps show binomial distributions, which we refer to as classical maps. Three 2D maps are constructed as variant maps. For smaller segment lengths, both classical and variant maps were illustrated in four groups. As the segment length increases, significant speckle patterns are observed. From a brief comparison of the two larger maps, the enlarged 2DP maps in Fig. 9a, b show better refined visual details than the other, smaller maps.

For the 2DPQ map, significant horizontal symmetries are observed; however, there is no reflection effect in the vertical direction.

From different 2DP maps with parameters m = {125,..., 128}, significant 
changes are observed: various speckle patterns are developed by both changes 
between lengths of segments and shift displacements. Enlarged maps are conve- 
nient to illustrate stochastic speckle patterns visibly. Some significant clusters are 
collected with speckle patterns associated to different control parameters in relevant 
maps. 

From the viewpoint of system operation, two types of control parameters, the segment length and the shift length of the sequence, provide an effective control mechanism to form clear speckle patterns on 2D distributions. More attention should be paid to exploring this type of issue systematically in further refined research.

The variant map system differs from two existing technologies: extracting information from speckle patterns to form random sequences, and the NIST 800-22 statistical testing package, which uses a single P-value or a list of static parameters for evaluation. The variant framework provides five maps to identify complicated measurements through detailed speckle patterns for any cryptographic sequence. The three refined 2D maps describe nonlinear dynamic behavior more accurately than the two 1D maps and may serve as quantitative measurements.

In relation to the variant map system, further exploration of both the theoretical foundation and key applications to cryptographic sequences is urgently required.

of Six Random Sequences on Variant Maps
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Abstract Various random streams have different stationary properties. It is 
necessary to use statistical probability and time series to evaluate the quality of stationary randomness. In this chapter, a testing model is used on three maps for a random sequence. The shifted sequence is divided into multiple segments to form three measuring sets. For each map, the maxima are extracted and three maximal values are identified. 2D maps represent stationary randomness. Conditions of stationary random sequences are investigated. Testing sets are collected from three types of six random resources: AES, DES, A5, RC4, Australian National University (ANU), and University of Science and Technology of China (USTC) (two block ciphers, two stream ciphers, and two quantum ciphers). Six random sequences are selected, and their measurements of stationary randomness are compared. Only differences of 0.0034-4.27% are recognized. Using variation ratios, the six samples fall into three variation categories, {AES, DES}, {A5, RC4}, and {ANU, USTC}, respectively. From a measuring viewpoint, all six samples show distinguished stationary randomness properties.
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1 Introduction 


In the modern cyberspace environment [1], network communication technologies play an essential role in supporting advanced developments of science, technology, and social daily life in every aspect. From the security viewpoint of network communication, Communication Security (COMSEC) systems [2] are the most important part. Every COMSEC system depends on block cipher, stream cipher, or hash technologies, and its core component is linked to a random number generator for any cryptographic application.

Quantum satellites [3] using Quantum Key Distribution (QKD) systems [4] in cryptographic applications are the most advanced ICT development for establishing ultra-secure quantum communications. For a QKD system, a truly random number generator [5], that is, a quantum random number generator, plays a key role.

From a reliability viewpoint, it is necessary to test the degree of stationary randomness under shift operations in evaluations. In this section, a list of relevant topics is discussed: pseudorandom and truly random sequences, P_value, statistical probability distributions, optical statistics, stationary and nonstationary properties, and variant maps.


1.1 Pseudorandom Sequences from Linear Stream Ciphers 


Traditional stream ciphers [6] based on the Linear Feedback Shift Register (LFSR) structure (in military cryptography) are used as pseudorandom number generators, due to their ease of implementation in simple hardware, long periods, and uniformly distributed streams. LFSR stream ciphers are the core of classical stream ciphers, with the mathematical theory of algebraic functions supporting system simulation and analysis.

However, an LFSR is a linear system, which leads to fairly easy cryptanalysis using the Berlekamp-Massey algorithm. The important LFSR-based stream ciphers A5/1 and A5/2 are used in GSM cell phones, and E0 is used in the Bluetooth protocol. From a cryptanalysis viewpoint, however, the A5/2 cipher has been broken and both A5/1 and E0 have serious weaknesses [7, 8].
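
The linearity weakness can be illustrated with a minimal Python sketch. It is an illustration only, not the code of any cipher named above: the 4-bit register, its tap positions, and the function names `lfsr_bits` and `berlekamp_massey` are assumptions chosen for the example. A small Fibonacci LFSR produces a stream, and the textbook Berlekamp-Massey procedure over GF(2) recovers the register length from the output alone.

```python
def lfsr_bits(state, taps, count):
    """Generate `count` output bits from a Fibonacci LFSR.

    `state` is the initial register content and `taps` lists the register
    positions XORed together to form the feedback bit (illustrative values).
    """
    state = list(state)
    out = []
    for _ in range(count):
        out.append(state[0])            # output the oldest bit
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]  # shift and append the feedback bit
    return out


def berlekamp_massey(bits):
    """Return the linear complexity of a 0-1 sequence over GF(2)."""
    n = len(bits)
    c = [0] * n   # current connection polynomial
    b = [0] * n   # connection polynomial before the last length change
    c[0] = b[0] = 1
    length, m = 0, -1
    for i in range(n):
        # Discrepancy between the bit predicted by c and the actual bit.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            for j in range(n - (i - m)):
                c[j + i - m] ^= b[j]
            if 2 * length <= i:
                length, m, b = i + 1 - length, i, previous
    return length


# Recurrence s[n+4] = s[n] XOR s[n+1], i.e. feedback polynomial x^4 + x + 1.
stream = lfsr_bits([1, 0, 0, 0], taps=(0, 1), count=32)
print(berlekamp_massey(stream))  # prints 4, the register length
```

A short observed stream is enough to expose the generator's structure, which is the reason nonlinear components are added in later designs.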


1.2 Pseudorandom Sequences from Nonlinear Stream 
Ciphers 


The new generation of stream ciphers [9, 10] is widely used in advanced cyber communications. Three general methods are applied to overcome the security weaknesses of LFSR-based stream ciphers:


1. Nonlinear Functions: Nonlinear combination of several bits from the LFSR state [11];

2. Nonlinear Parts: Nonlinear combination of the output bits of two or more LFSRs, or use of an evolutionary algorithm for nonlinearity [12]; and

3. Clock Control: Irregular clocking of the LFSR, as in the alternating step generator [13].


A series of nonlinear algorithms has emerged [14]: nonlinear equivalence [15], evolutionary methods [12], AES cipher [16], RC4 [17], ZUC [11], cellular automata [18], and nonlinear dynamic systems [19].

The new generation of stream ciphers has shifted from the traditional mode, LFSR [6], to various nonlinear modes: NLFSR [20, 21], clock control [13], nonlinear functions [11], etc.; it is essential for ciphers to be integrated and implemented [22] to satisfy security models. However, unlike the LFSR, which has well-established linear mathematical theories and simulation tools, it is extremely difficult to use advanced nonlinear mathematical theories, recursive models, descriptive tools, and implementation schemes [19] in nonlinear dynamic environments. How to evaluate cryptographic sequences generated by nonlinear stream ciphers is an urgent problem for modern stream/block ciphers.


1.3 Truly Random Sequences from Hardware Devices 


In addition to pseudorandom sequences generated by stream ciphers, high-quality 
stochastic oscillators of truly random sequences are generated from special hardware 
devices such as laser photonics [23], nonlinear optics [24], quantum optics [25], 
quantum noises [26], thermal noise [27], and chaos and fractal nonlinear dynamics 
[28]. 

Since various truly random sequences are created from specific physical models with special principles and uncertain methodologies, it is extremely difficult for cryptographic researchers to make proper measurements to explore their nonlinear dynamic properties.


1.4 P_value Schemes—Statistical Tests on Cryptographic 
Sequences 


Randomness has been explored for many years [29] through a series of statistical testing theories and methods. From a testing viewpoint, it is feasible to apply statistical testing packages to measure the randomness properties of a given cryptographic sequence. The NIST 800-22 package is a typical representative, providing more than 15 testing schemes for evaluation. Using the testing package, it is essential to check whether P_value > 0.01 for the sequence. Since such a measuring scheme provides a static property, it is difficult to use only the P_value parameter to express the complex dynamic behaviors intrinsically involved in cryptographic sequences.

Since comprehensive behaviors in nonlinear dynamics may drastically increase computational complexity by involving complicated dynamic properties in a multivariate environment, those dynamic behaviors are completely ignored in P_value schemes.
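
As a concrete illustration of a P_value scheme, the following minimal Python sketch computes the P_value of the frequency (monobit) test, one of the tests in the NIST 800-22 package. The function name `monobit_p_value` is chosen here for illustration and is not part of any package; the sketch shows how a whole sequence is reduced to a single static number.

```python
import math


def monobit_p_value(bits):
    """P_value of the frequency (monobit) test for a 0-1 sequence."""
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1, then sum over the whole sequence.
    s = sum(2 * b - 1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    # Complementary error function of the normalized excess of ones or zeros.
    return math.erfc(s_obs / math.sqrt(2))


example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
p_value = monobit_p_value(example)
print(p_value)          # about 0.527
print(p_value > 0.01)   # True: the sequence passes this single test
```

Any two sequences with the same excess of ones over zeros receive the same P_value, regardless of how the bits are arranged, which is the static character noted above.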


1.5 Multiple Statistical Probability Distributions 


Measuring cryptographic sequences under segment conditions, multiple statistical 
probability schemes are useful to create various distributions to illustrate complex 
spatial relationships. 

Multivariate normal probability distributions are the most important and power- 
ful tool to test stochastic characteristics of a random data sequence [30] under the 
framework of probability, stochastic process, and statistics [31] for nonlinear prob- 
lems. In this kind of measuring models, when a data sequence is sufficiently long, 
the high-dimensional probability distribution of the sequence [32] is converted into 
a continuous Gaussian distribution. 

A typical projection model is shown in Fig. 1a; the central part shows a Gaussian surface with an unbalanced distribution in a 2D plane, distributed as P(X, Y) measures with pseudo-colors, and two 1D projections shown in the horizontal P(X) and vertical P(Y) planes, respectively. In Fig. 1b, a standard Gaussian surface with symmetric shapes is illustrated, and the 2D projection of its pseudo-color map is shown in Fig. 1c with a continuous distribution of color on the map.


Fig. 1 Multivariate Gaussian probability distributions (a)-(c); a Bivariate normal distribution with two probability projections; b A symmetric bivariate normal surface with pseudo-colors; c A 2D pseudo-color map of the symmetric bivariate normal surface


From the sample figures, the relationship between the projection curve and the two 1D Gaussian distributions can be observed in the multivariate normal probability environment. Multivariate Gaussian probability distributions support various schemes for analyzing complex stochastic data sets of measuring sequences in many applications under continuous conditions.


1.6 Photon Statistics in Quantum Optics


Photon statistics is the theoretical and experimental study of the statistical distributions produced in photon counting experiments, used to analyze the statistical nature of photons in a light source.

Three types of statistical distributions, shown in Fig. 2, can be obtained depending on the light source [33]: Poissonian, super-Poissonian, and sub-Poissonian. The variance and average number of photon counts are identified for the corresponding distribution. Both Poissonian and super-Poissonian light are described by a semi-classical theory in which the light source is modeled as an electromagnetic wave and the atom is modeled by quantum mechanics. In contrast, sub-Poissonian light requires the quantization of the electromagnetic field for a proper description and is a direct measure of the particle nature of light.


Fig. 2 Three photon statistical distributions: sub-Poissonian, Poissonian, and super-Poissonian
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1.7 Stationary and Non-stationary Properties 


In mathematics and statistics, a stationary process is a stochastic process [34] whose joint probability distribution does not change when shift operations are performed. Consequently, parameters such as mean and variance, if they are present, also do not change over time. Stationarity is an interesting property for many statistical procedures in time series analysis.

In 1938, Kolmogorov established the basic theorems for smoothing and predicting stationary stochastic processes [35, 36] that had major military applications during the Cold War.

In applied mathematics, the Wiener-Khinchin theorem [37-39] states that the Autocorrelation Function (ACF) of a wide-sense-stationary process has a spectral decomposition given by the power spectrum of the process. One effective way of identifying a stationary time series is the ACF plot [40]. For a stationary time series, the ACF drops to zero relatively quickly, while the ACF of nonstationary data decreases slowly [41].
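
The ACF criterion can be tried out with a minimal Python sketch of the usual sample autocorrelation estimator. The function name `acf` and the two test series are assumptions for illustration: pseudorandom bits from Python's `random` module stand in for a stationary series, and a linear trend stands in for nonstationary data.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


rng = random.Random(0)
noise = [rng.getrandbits(1) for _ in range(4096)]  # stationary stand-in
trend = list(range(200))                           # nonstationary stand-in

print([round(v, 3) for v in acf(noise, 3)])  # lag 0 is 1, other lags near 0
print([round(v, 3) for v in acf(trend, 3)])  # all lags stay close to 1
```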


1.8 Datastreams 


1.8.1 Pseudorandom Number Resources 


Four cryptographic sequences are selected: {AES, DES, A5, RC4}. For each cipher, a cryptographic sequence of 100 MB of data is collected.

{AES, DES} are block ciphers [16] operated in OFB mode, which turns the block cipher output into a stream cipher keystream.

A5/1 is a stream cipher [42] built around a combination of three LFSRs with irregular clocking.

RC4 is a stream cipher [43] designed by Ron Rivest in 1987. The design of RC4 avoids the use of LFSRs; its structure is ideal for software implementation, as it requires only byte manipulations.


1.8.2 Two Quantum Random Number Resources 


Reliable and unbiased random numbers are important in cryptographic applications. 
Many algorithms can be used to generate pseudorandom numbers, but they can never 
be perfectly random or indeterministic. 

Quantum random numbers can be generated from a physical quantum source such as coherent laser light, by splitting a beam of light into two beams and then measuring the power in each beam. The light intensity in each beam fluctuates about the mean, and those fluctuations can be converted into a source of random numbers [44-46] following a stationary Poisson distribution.

Two quantum cryptographic resources are selected: {ANU, USTC}. For each quantum cipher, a truly random sequence of 1 GB of data is collected.

USTC resource: In the Key Laboratory of Quantum Information, USTC, CAS, true random number sequences are generated [45]. This type of true random sequence supports advanced quantum communication devices in QKD systems [47, 48].

More than 20 GB of quantum random number sequences are provided by USTC for randomness testing.

ANU resource: The ANU Quantum Random Numbers Server is an open website 
[49] to offer true random numbers to anyone on the Internet. Such random numbers 
are generated in real time by measuring the quantum fluctuations of the vacuum. 
The electromagnetic field of the vacuum exhibits random fluctuations in phase and 
amplitude at all frequencies. By carefully measuring these fluctuations, ultra-high 
bandwidth random numbers can be generated. Relevant data streams are downloaded. 


1.9 Variant Framework 


The conjugate classification [50] is proposed to apply seven measures in a hierarchy 
to partition the kernels of four regular plane lattices on n = {4, 5, 7, 9} cases for 2D 
binary images. For 1D cellular automata sequences, global random behaviors [51] 
are visualized in 2D maps. 

Various schemes following the top-down strategy are explored to use multiple 
measures to partition special phase spaces from a top state set to multiple bottom 
states via multilevels of a hierarchy in combinatorial algorithms [52], image analysis, 
and processing for many years. 

For n-tuple bit vectors, the variant logic framework [53] is proposed, and various 
applications are explored: 3D visual method on random number sequences [54], vari- 
ant Pseudorandom Number Generator (PRNG) [55, 56], computational simulation 
on quantum interactions [57, 58], noncoding DNA analysis [59], and bat echoloca- 
tion [60]. 


1.10 Proposed Scheme 


For the convenience of testing stationary randomness on six cryptographic sequences, we propose a testing system for a stationary random sequence of length N. The sequence is divided into M segments of a given length m. From each 0-1 segment, a 2-tuple pair of measures is extracted: the number of 1 elements and the number of 01 patterns in the segment. The paired measures form a sequence of M pairs, an ordered measuring set with M elements. The pairs of the measuring sequence are then separated into two independent measuring sequences, keeping each parameter in the same order. In total, three sequences of distinct measures are constructed: two sequences of single measures and one sequence of 2-tuple measures.

Following this approach, the two single measuring sequences are sorted into two 1D numeric arrays, giving statistical histograms that correspond to 1D maps, and the 2-tuple measuring sequence is sorted into a 2D integer array, giving a statistical histogram that is a 2D map. Under controlled changes of the shift displacement, multiple results of the three measuring sequences are transformed into 1D statistical histograms and 2D pseudo-color maps to show effective patterns of the generated sequence under various positions and conditions over a list of shift operations.


1.11 Organization of the Chapter 


This chapter describes a testing system for a stationary random sequence, with diagrams of the system architecture and the core modules with their input/output and processing functions, in Sect. 2. In Sect. 3, the relationships among measuring sequences and the three statistical distribution maps are analyzed. In Sect. 4, four random sequences are generated from the {AES, DES, A5, RC4} ciphers and two quantum cryptographic sequences are collected from the Key Laboratory of Quantum Information, USTC, CAS, and the ANU quantum number site. Based on the visual maps in Sect. 4, numeric analysis and a brief comparison are carried out in Sect. 5. Finally, in Sect. 6, the main results are summarized.


2 Testing System 


To describe the testing system, diagrams are shown in Fig. 3.


Input: A 0-1 sequence -> ST -> SM -> CP -> Output: Three maps / Maximals

ST Shifted Transformation
SM Segment Measurement
CP Combinatorial Projection


Fig. 3 The architecture of testing stationary random sequences 
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2.1 System Architecture 


This system is composed of five parts: Input, Shifted Transformation (ST), Segment 
Measurement (SM), Combinatorial Projection (CP), and Output. 

The input of the testing system is a selected 0-1 sequence, which is processed by the ST, SM, and CP modules in turn. Its output is composed of three maps (two in 1D and one in 2D) for visual distributions, and three maximals.


2.2 Core Modules 


The testing system consists of three modules: {ST, SM, CP}.
Input: X, an N = m × M bit sequence; m, the segment length; M, the total number of segments; r, the shift length;
Output: Three maps {1DP, 1DQ, 2DPQ}; three maximals {$1DP_x$, $1DQ_x$, $2DPQ_x$}
Process: Shift X by r positions to obtain Y = X(r) in ST. Form the segment measuring sequences in SM, then project the three measuring sequences as three maps and extract three maximals in CP.

Let X, Y be 0-1 sequences with N elements. The ST module takes the sequence X as input and shifts the whole sequence by r positions to produce the shifted sequence Y = X(r) (i.e., a cyclic shift, right for + or left for -).


\[ Y = X(r), \quad Y[I] = X[(I + r) \bmod N], \quad 0 \le I < N; \quad X[I], Y[I] \in \{0, 1\} \tag{1} \]

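In the SM module, the shifted vector is the input and is divided from one long sequence into M segments. For the i-th sub-vector, 0 ≤ i < M, the element at the j-th position, 0 ≤ j < m, is denoted as $Y_{i,j}$.

After the segmenting operation, the sub-vectors form a matrix of M rows and m columns; the m positions of the i-th complete row vector correspond to a pair of 2-tuple measures $(p_i, q_i)$.


\[ Y = \{Y_i\}_{i=0}^{M-1} \tag{2} \]

\[ Y_i = \{Y_{i,0}, Y_{i,1}, \ldots, Y_{i,j}, \ldots, Y_{i,m-1}\}, \quad 0 \le i < M, \; 0 \le j < m \tag{3} \]

\[ Y_i \rightarrow (p_i, q_i), \quad 0 \le i < M \tag{4} \]

\[ Y \rightarrow \{(p_i, q_i)\}_{i=0}^{M-1} \tag{5} \]

The pair of 2-tuple measures $(p_i, q_i)$ is determined by the following formulas:


\[ Y_{i,j} = Y[J] \in \{0, 1\}; \quad J = i \times m + j, \quad 0 \le i < M, \; 0 \le j < m, \; 0 \le J < m \times M \tag{6} \]

\[ p_i = \sum_{j=0}^{m-1} Y_{i,j}, \quad Y_{i,j} \in \{0, 1\}, \; 0 \le p_i \le m; \tag{7} \]

\[ q_i = \sum_{j=0}^{m-1} \left[ (Y_{i,j-1}, Y_{i,j}) = (0, 1) \right], \quad j - 1 \pmod{m}, \; 0 \le q_i \le \lfloor m/2 \rfloor; \tag{8} \]

In Eq. (8), the bracket contributes 1 when the pair of neighboring elements is 01 and 0 otherwise, and the index j - 1 is taken modulo m, so the last element of a segment is the neighbor of its first element.

For example, X = 0011010010, N = 10, M = 2, m = 5 gives ($p_0 = 2$, $q_0 = 1$) and ($p_1 = 2$, $q_1 = 2$).

The SM outputs the ordered M pairs of 2-tuple measures $\{(p_i, q_i)\}_{i=0}^{M-1}$.

The CP module consists of two units: split and projection. The split takes the SM's output as its input, and the 2-tuple measuring sequence $\{(p_i, q_i)\}_{i=0}^{M-1}$ is split into two independent measuring sequences, $\{p_i\}_{i=0}^{M-1}$ and $\{q_i\}_{i=0}^{M-1}$, keeping the original order invariant.

The three measure sequences are $\{p_i\}_{i=0}^{M-1}$, $\{q_i\}_{i=0}^{M-1}$, and $\{(p_i, q_i)\}_{i=0}^{M-1}$.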
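
The ST, SM, and split steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the function names `cyclic_shift` and `segment_measures` are assumptions. The shift follows Eq. (1) as written, and the 01 count is taken cyclically within each segment, which is the reading that reproduces the worked example X = 0011010010.

```python
def cyclic_shift(x, r):
    """ST step: Y[I] = X[(I + r) mod N], following Eq. (1)."""
    n = len(x)
    r %= n
    return x[r:] + x[:r]


def segment_measures(y, m):
    """SM step: ordered list of (p_i, q_i) pairs for segments of length m."""
    if len(y) % m != 0:
        raise ValueError("sequence length must be a multiple of m")
    pairs = []
    for start in range(0, len(y), m):
        seg = y[start:start + m]
        p = sum(seg)  # number of 1 elements
        # Number of 01 patterns; seg[j - 1] wraps around for j = 0.
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        pairs.append((p, q))
    return pairs


x = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]      # X = 0011010010
pairs = segment_measures(cyclic_shift(x, 0), 5)
p_seq = [p for p, _ in pairs]            # split: {p_i}
q_seq = [q for _, q in pairs]            # split: {q_i}
print(pairs)  # [(2, 1), (2, 2)], matching the example in the text
```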

The projection unit consists of three steps: Project Array (PA), Color Map (CM), 
and Get Maximal (GM). For three measuring sequences, two types of 1D and 2D 
measures will be processed separately. 

The PA processes measuring sequences to transform them into integer arrays and 
the CM will organize them on either normalized histograms (1D measures) or color 
maps (2D measures), respectively. 

The 1D measures involve two measuring sequences: $\{p_i\}_{i=0}^{M-1}$ and $\{q_i\}_{i=0}^{M-1}$. Let P[m + 1], Q[⌊m/2⌋ + 1] and NP[m + 1], NQ[⌊m/2⌋ + 1] be two pairs of 1D (integer, float) arrays to represent the corresponding elements.

The 1DP statistic histogram is generated from the sequence $\{p_i\}_{i=0}^{M-1}$ using NP and P, two arrays (floating point, integer) with (m + 1) elements. For the j-th elements NP[j], P[j], 0 ≤ j ≤ m, and the maximal element $1DP_x$, the output can be obtained by the following procedure:


Initialization: ∀ NP[j] = 0.0,
                P[j] = 0, 0 ≤ j ≤ m;
Calculation:    for (i = 0; i < M; i++)
                { P[p_i]++; }
Normalization:  for (j = 0; j <= m; j++)
                { NP[j] = P[j]/M; }
Get Maximal:    1DP_x = max{NP[j] | 0 ≤ j ≤ m}


In the 1DP map, the PA corresponds to initialization and calculation; the CM handles normalization and the GM identifies the maximal element of the map.

The 1DQ statistic histogram is generated from the sequence $\{q_i\}_{i=0}^{M-1}$ using NQ and Q, two arrays (floating point, integer) with (⌊m/2⌋ + 1) elements. For the j-th elements NQ[j], Q[j], 0 ≤ j ≤ ⌊m/2⌋, and the maximal element $1DQ_x$, the output can be obtained by the following procedure:


Initialization: ∀ NQ[j] = 0.0,
                Q[j] = 0, 0 ≤ j ≤ ⌊m/2⌋;
Calculation:    for (i = 0; i < M; i++)
                { Q[q_i]++; }
Normalization:  for (j = 0; j <= ⌊m/2⌋; j++)
                { NQ[j] = Q[j]/M; }
Get Maximal:    1DQ_x = max{NQ[j] | 0 ≤ j ≤ ⌊m/2⌋}


Using the P, NP, Q, NQ arrays, it is possible to generate the corresponding 1D statistical histograms as 1D maps.

In the 1DQ map, the PA corresponds to initialization and calculation; the CM handles normalization and the GM identifies the maximal element of the map.
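
The two 1D procedures share the same four steps, so they can be sketched with one Python helper. The name `histogram_1d` and its return convention are assumptions for illustration; the input measures are those of the example X = 0011010010 with m = 5.

```python
def histogram_1d(values, size):
    """Build the integer and normalized arrays and return the maximal."""
    total = len(values)                        # M, the number of segments
    counts = [0] * size                        # Initialization
    for v in values:                           # Calculation
        counts[v] += 1
    normalized = [c / total for c in counts]   # Normalization
    return counts, normalized, max(normalized) # Get Maximal


m = 5
p_seq = [2, 2]   # {p_i} of the example sequence
q_seq = [1, 2]   # {q_i} of the example sequence

P, NP, dp_max = histogram_1d(p_seq, m + 1)        # 1DP arrays, m + 1 bins
Q, NQ, dq_max = histogram_1d(q_seq, m // 2 + 1)   # 1DQ arrays, floor(m/2) + 1 bins
print(P, dp_max)   # [0, 0, 2, 0, 0, 0] 1.0
print(Q, dq_max)   # [0, 1, 1] 0.5
```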

The 2D measures process one measuring sequence: $\{(p_i, q_i)\}_{i=0}^{M-1}$. Let PQ[m + 1, ⌊m/2⌋ + 1] be a 2D integer array.

The 2DPQ statistic histogram is generated from the sequence $\{(p_i, q_i)\}_{i=0}^{M-1}$ using PQ, a 2D integer array with (m + 1) × (⌊m/2⌋ + 1) elements. For the (i, j)-th element PQ[i, j], 0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋, and the maximal element $2DPQ_x$, their values can be obtained by the following procedure:


Initialization: ∀ PQ[i, j] = 0,
                0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋;
Calculation:    for (i = 0; i < M; i++)
                { PQ[p_i, q_i]++; }
Pseudo-color:   Matching a proper color for
                ∀ PQ[i, j], 0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋
Get Maximal:    2DPQ_x = max{PQ[i, j] | 0 ≤ i ≤ m,
                0 ≤ j ≤ ⌊m/2⌋}


In the 2DPQ map, the PA corresponds to initialization and calculation; the CM handles pseudo-color and the GM identifies the maximal element of the map.

Through the CP module, the three measuring sequences are transformed into two 1D arrays and one 2D array with (m + 1), (⌊m/2⌋ + 1), and (m + 1) × (⌊m/2⌋ + 1) clusters.

The outputs of the testing system are three maps {1DP, 1DQ, 2DPQ} and three maximals {$1DP_x$, $1DQ_x$, $2DPQ_x$} as expected statistical distributions and representatives of the input 0-1 sequence, respectively.
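
The 2DPQ procedure can be sketched in Python as follows. The function name `map_2dpq` and the sample pairs are assumptions for illustration, and the pseudo-color step is left out because it depends on the plotting tool.

```python
def map_2dpq(pairs, m):
    """Build the (m + 1) x (floor(m/2) + 1) count array and its maximal."""
    rows, cols = m + 1, m // 2 + 1
    pq = [[0] * cols for _ in range(rows)]    # Initialization
    for p, q in pairs:                         # Calculation
        pq[p][q] += 1
    maximal = max(max(row) for row in pq)      # Get Maximal
    return pq, maximal


sample_pairs = [(2, 1), (2, 2), (2, 1)]        # hypothetical measures, m = 5
pq, pq_max = map_2dpq(sample_pairs, 5)
print(pq[2])    # [0, 2, 1]
print(pq_max)   # 2
```

The maximal returned here is a raw count, as in the procedure above; dividing it by the number of segments M would give a relative value.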
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3 Association Analysis 


Sorting the $\{p_i\}_{i=0}^{M-1}$ measuring sequence into a 1D histogram is a counting scheme. When the measuring sequence meets ideal conditions, the 1D statistical distribution is a binomial distribution.


Lemma 1 For an input 0-1 sequence, if the total number of segments is equal to $M = 2^m$, and each segment of m bits appears only once in the sequence, then the 1DP array satisfies the binomial distribution

\[ P[i] = \binom{m}{i}, \quad 0 \le i \le m \tag{9} \]


Corollary 1 If the input sequence meets the conditions of Lemma 1, then the total number of items in the 1DP array is equal to

\[ \sum_{i=0}^{m} P[i] = 2^m = M \tag{10} \]


Lemma 2 If the input sequence meets the conditions of Lemma 1, then the 1DQ array satisfies the following relation:

\[ Q[i] = 2\binom{m}{2i}, \quad 0 \le i \le \lfloor m/2 \rfloor \tag{11} \]


Corollary 2 If the input sequence meets the conditions of Lemma 1, then the total number of items in the 1DQ array is equal to

\[ \sum_{i=0}^{\lfloor m/2 \rfloor} Q[i] = 2^m = M \tag{12} \]
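
The ideal case can be checked numerically for small m. The Python sketch below enumerates every m-bit segment exactly once, as Lemma 1 requires, and compares the resulting arrays with binomial coefficients. The function name `ideal_histograms` is an assumption, and the closed form used for the 1DQ array, twice the binomial coefficient of m over 2i, is the reading that is consistent with the cyclic 01 count and with the total in Corollary 2.

```python
from itertools import product
from math import comb


def ideal_histograms(m):
    """Count p and cyclic-01 q over all 2**m segments of length m."""
    P = [0] * (m + 1)
    Q = [0] * (m // 2 + 1)
    for seg in product((0, 1), repeat=m):
        P[sum(seg)] += 1
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        Q[q] += 1
    return P, Q


m = 6
P, Q = ideal_histograms(m)
print(P)                 # [1, 6, 15, 20, 15, 6, 1]
print(Q)                 # [2, 30, 30, 2]
print(sum(P), sum(Q))    # 64 64
```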


Corollary 3 For any 0-1 sequence with N elements, the projections of a 2DPQ array in two directions correspond to a 1DP array and a 1DQ array, respectively.
Proof A 2DPQ array is generated from a measuring sequence $\{(p_i, q_i)\}_{i=0}^{M-1}$, and the 2DPQ array is projected in two directions: $P[i] = \sum_{j=0}^{\lfloor m/2 \rfloor} PQ[i, j]$, 0 ≤ i ≤ m; $Q[j] = \sum_{i=0}^{m} PQ[i, j]$, 0 ≤ j ≤ ⌊m/2⌋. So the two projections correspond to a 1DP array and a 1DQ array.


Corollary 4 For an arbitrary 0-1 input sequence, the total number of items in the 2DPQ array is equal to

\[ \sum_{i=0}^{m} \sum_{j=0}^{\lfloor m/2 \rfloor} PQ[i, j] = \sum_{i=0}^{m} P[i] = \sum_{j=0}^{\lfloor m/2 \rfloor} Q[j] = M \tag{13} \]
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In Corollaries 3 and 4, the total number of each component on three statistic arrays 
is equal to the total number of segments M, and the 2DPQ array occupies a central 
position in the projection to other two arrays. 
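
The projection relation of Corollaries 3 and 4 can be checked with a short Python sketch. The helper names `build_pq` and `projections` and the sample pairs are assumptions for illustration.

```python
def build_pq(pairs, m):
    """2D count array with (m + 1) rows and floor(m/2) + 1 columns."""
    pq = [[0] * (m // 2 + 1) for _ in range(m + 1)]
    for p, q in pairs:
        pq[p][q] += 1
    return pq


def projections(pq):
    """Row sums give the 1DP array, column sums give the 1DQ array."""
    p_array = [sum(row) for row in pq]
    q_array = [sum(col) for col in zip(*pq)]
    return p_array, q_array


sample_pairs = [(2, 1), (2, 2), (3, 1), (0, 0)]   # hypothetical, M = 4, m = 5
P, Q = projections(build_pq(sample_pairs, 5))
print(P)                # [1, 0, 2, 1, 0, 0]
print(Q)                # [1, 2, 1]
print(sum(P), sum(Q))   # 4 4, both equal to M
```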

Let {$1DP_x(r)$, $1DQ_x(r)$, $2DPQ_x(r)$} denote the three maximals of the selected sequence for 0 ≤ r ≤ m; the three maximal sequences are $\{1DP_x(r)\}_{r=0}^{m}$, $\{1DQ_x(r)\}_{r=0}^{m}$, and $\{2DPQ_x(r)\}_{r=0}^{m}$.

For a 0-1 sequence with M segments, if each segment of m bits is composed of a state and only one state is involved, then the sequence is a circular sequence.


Lemma 3 For 0 ≤ r ≤ m, a sequence is a circular sequence iff $1DP_x(r) = 1DQ_x(r) = 1$ and $2DPQ_x(r) = M$.


Proof For a circular sequence, shift operations do not change the pair of measures; only a single (p, q) value is possible.


Theorem 1 For a sequence with stationary random properties, it has

\[ 1DP_x(0) \simeq \cdots \simeq 1DP_x(r) \simeq \cdots \simeq 1DP_x(m) \ll 1, \]

\[ 1DQ_x(0) \simeq \cdots \simeq 1DQ_x(r) \simeq \cdots \simeq 1DQ_x(m) \ll 1, \text{ or} \]

\[ 2DPQ_x(0) \simeq \cdots \simeq 2DPQ_x(r) \simeq \cdots \simeq 2DPQ_x(m) \ll 1. \]


Proof Under any random condition, the pairs {(p, q)} must take certain states significantly different from those of a circular sequence, in either the ≪ 1 or the ≪ M condition. Under the stationary random condition, all maximals satisfy only ≃ relations under shift operations.


For a G map, let $G_x$ be an average variation, $\Delta G_x$ be a region of variations, and $G_x^R = \Delta G_x / G_x$ be a variation ratio.
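
A minimal Python sketch of these three measures follows. It rests on two assumptions that the text does not spell out at this point: the average variation is taken as the arithmetic mean of the maximals G_x(r) over the shifts r, and the region of variations is taken as the difference between the largest and smallest maximal. The function name `variation_measures` and the sample values are hypothetical.

```python
def variation_measures(maximals):
    """Return (average, range, ratio) for a list of maximals G_x(r)."""
    g_avg = sum(maximals) / len(maximals)      # assumed: mean over shifts r
    g_range = max(maximals) - min(maximals)    # assumed: max minus min
    return g_avg, g_range, g_range / g_avg     # ratio = range / average


# Hypothetical maximals for three shift values, not measured data.
g_avg, g_range, g_ratio = variation_measures([0.14, 0.15, 0.13])
print(round(g_avg, 4), round(g_range, 4), round(g_ratio, 4))
```

The ratio form agrees with the tabulated values reported later in the chapter; for example, 0.19725 / 13.953 is about 1.41%, the A5 entry for 1DQ in Table 1.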


Theorem 2 For the i-th and j-th G maps $G^i$ and $G^j$ with $G_x^i \simeq G_x^j$ and variation ratios $G_x^{R,i}$ and $G_x^{R,j}$, if a variation ratio has the minimal value, then the relevant map has a better stationary random property than the one with the maximal value.

Proof Since $G_x^R = \Delta G_x / G_x$ and $G_x^i \simeq G_x^j$, the ratio is a relative measure with $\forall r, (\max\{G_x(r)\} - \min\{G_x(r)\})/G_x \ge 0$. So $\min\{\Delta G_x^i, \Delta G_x^j\} \le \max\{\Delta G_x^i, \Delta G_x^j\}$, and the minimal variation ratio indicates the better stationary random property.


Corollary 5 For different maps, it is better to compare variation ratios relevant to the same type of distribution.


Proof For various maps in the same type of distribution, the relevant {$G_x$} should satisfy the similar-equal ($\simeq$) condition.


4 Testing Results 


Four pseudorandom sequences are generated by the {A5, RC4, DES, AES} ciphers, and two quantum cryptographic sequences are selected from the ANU and USTC resources.
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Fig. 4 Six cryptographic sequences on r = 32 1DP, 2DPQ, and 1DQ maps 
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Fig. 5 Six cryptographic sequences on r = 32 2DPQ maps 


Typical results of testing stationary properties for six sequences on 18 maps of 
{1DP, 2DPQ, 1DQ} are shown in Fig. 4. Each position contains nine shift values of 
r = 32 selected. A total number of 18 maps are included. Six 2DPQ maps are shown 
in Fig. 5 as enlarged maps. Each map has shift values of r = 32, respectively. 

Three variation measures $\{G_x, \Delta G_x, G^R\}$ for maps {1DP, 2DPQ, 1DQ} of six 
sequences are shown in Table 1, and their sorted orders are listed in Table 2. 
Twenty-four 2D maps of maximal curves for r = 0 to 128 are shown in Table 3. The three left 
columns contain 18 enlarged variation maps of {1DQ, 1DP, 2DPQ}, and the last 
column contains six variation regions of 1DQ + 1DP + 2DPQ in six 2D maps. Six 
enlarged 2D maps are shown in Table 4 and six larger 2D maps are shown in Table 5. 

In Table 6, 49 pairs of differences for variation ratios are listed in three 7 x 7 
tables to illustrate refined quantity measures on three levels. The seven diagonal 
entries are trivially 0. For the other 42 nontrivial values, let $dG^R\%$ 
denote differences of $G^R\%$ based on the basic variation ratios in Table 1; the 
differences of variation ratios among the six samples are listed. Differences of three 
variation ratios $\{dQ^R\%, dP^R\%, dPQ^R\%\}$ on seven items {∅, AES, DES, A5, 
RC4, ANU, USTC} are illustrated. 
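
The construction of one 7 x 7 difference table can be sketched as follows. This is a minimal illustration, assuming that each entry is the row item's ratio minus the column item's ratio and that the empty item ∅ has ratio 0. The input values are the 1DQ ratios from Table 1, so individual entries may differ slightly from the printed Table 6.

```python
# 1DQ variation ratios (in %) from Table 1; "∅" is the empty reference item.
ratios_1dq = {
    "∅": 0.0,
    "AES": 3.0,
    "DES": 2.53,
    "A5": 1.4136,
    "RC4": 1.5471,
    "ANU": 1.2722,
    "USTC": 1.4102,
}


def difference_table(ratios):
    """Return table[a][b] = ratios[a] - ratios[b] for every ordered pair."""
    names = list(ratios)
    return {a: {b: round(ratios[a] - ratios[b], 5) for b in names} for a in names}


table = difference_table(ratios_1dq)
nontrivial = sum(1 for a in table for b in table if a != b)
print(len(table) ** 2, "entries,", nontrivial, "off the diagonal")
print("AES vs A5:", table["AES"]["A5"])
```

By construction the table is antisymmetric, so the 42 nontrivial entries carry 21 independent magnitudes.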


5 Result Analysis 


Eighteen maps in Fig. 4 are composed of three groups. Six 1DP maps have similar 
distributions in bell shapes to illustrate Poissonian distributions. Six 2DPQ maps are 


Table 1 Comparisons on three variation measures for six samples 

1DQ:   | Q_x%     ΔQ_x%    Q^R% 
AES:   | 14.05    0.42     3.0 
DES:   | 14.05    0.36     2.53 
A5:    | 13.953   0.19725  1.4136 
RC4:   | 14.210   0.21985  1.5471 
ANU:   | 13.961   0.17761  1.2722 
USTC:  | 13.944   0.19664  1.4102 

1DP:   | P_x%     ΔP_x%    P^R% 
AES:   | 7.07     0.42     3.96 
DES:   | 7.05     0.25     3.5 
A5:    | 7.02650  0.17665  2.51409 
RC4:   | 7.19459  0.16223  2.25498 
ANU:   | 7.0352   0.15472  2.1992 
USTC:  | 7.0289   0.13542  1.9265 

2DPQ:  | PQ_x%    ΔPQ_x%   PQ^R% 
AES:   | 1.0      0.09     9.02 
DES:   | 1.0      0.08     8.21 
A5:    | 0.98690  0.05508  5.5818 
RC4:   | 1.02754  0.05106  4.96913 
ANU:   | 0.99245  0.04791  4.8276 
USTC:  | 0.98675  0.04691  4.7544 
Table 2 Possible sorted orders of three types of variation measures; (a) G_x%, (b) ΔG_x%, (c) G^R% 

G_x%   | min   max | min - max sorted          | min-max range 
1DQ:   | USTC  RC4 | USTC-A5-ANU-AES-DES-RC4   | 13.944 < Q_x% < 14.21 
1DP:   | A5    RC4 | A5-USTC-ANU-DES-AES-RC4   | 7.0289 < P_x% < 7.19459 
2DPQ:  | USTC  RC4 | USTC-A5-ANU-DES-AES-RC4   | 0.98675 < PQ_x% < 1.02754 
(a) 

ΔG_x%  | min   max | min - max sorted          | min-max range 
1DQ:   | ANU   AES | ANU-USTC-A5-RC4-DES-AES   | 0.17761 < ΔQ_x% < 0.42 
1DP:   | USTC  AES | USTC-ANU-RC4-A5-DES-AES   | 0.13542 < ΔP_x% < 0.42 
2DPQ:  | USTC  AES | USTC-ANU-RC4-A5-DES-AES   | 0.04691 < ΔPQ_x% < 0.09 
(b) 

G^R%   | min   max | min - max sorted          | min-max range 
1DQ:   | ANU   AES | ANU-USTC-A5-RC4-DES-AES   | 1.2722 < Q^R% < 3.0 
1DP:   | USTC  AES | USTC-ANU-RC4-A5-DES-AES   | 1.9265 < P^R% < 3.96 
2DPQ:  | USTC  AES | USTC-ANU-RC4-A5-DES-AES   | 4.7544 < PQ^R% < 9.02 
(c) 


2D distributions. They are symmetric in the left/right direction and have a broken 
symmetry in the up/down direction. Pseudo-color pixels on the six maps indicate relevant 
3D shapes. Compared with the six 1DP maps, the six 1DQ maps have similar distributions 
with narrower bell shapes, illustrating sub-Poissonian distributions. It is possible 
to illustrate different maps on shift r = 32 for each map. 

In Table 1, three pairs of maximal and minimal variation ratios are identified, 
and three full orders are sorted in Table 2. Compared with the $G_x$ sorted orders, for both 
$\{\Delta G_x, G^R\}$ variation ratios the six samples keep the same sorted orders in two groups, 
1DQ and {1DP, 2DPQ}, for their min-max variation ratios. Six enlarged 2DPQ 
maps on shift r = 32 are shown in Fig. 5 to form three pairs {AES:DES, RC4:A5, 
ANU:USTC}. The two maps in each pair have similar visual distributions. 
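
The sorted orders in Table 2 follow directly from the values in Table 1. A minimal sketch, using the ratio columns of the 1DQ and 2DPQ blocks; the helper name is hypothetical.

```python
# Variation ratios (in %) copied from Table 1.
q_ratio = {"AES": 3.0, "DES": 2.53, "A5": 1.4136,
           "RC4": 1.5471, "ANU": 1.2722, "USTC": 1.4102}
pq_ratio = {"AES": 9.02, "DES": 8.21, "A5": 5.5818,
            "RC4": 4.96913, "ANU": 4.8276, "USTC": 4.7544}


def min_max_order(measure):
    """Join sample names in ascending order of their measure."""
    return "-".join(sorted(measure, key=measure.get))


print("1DQ :", min_max_order(q_ratio))
print("2DPQ:", min_max_order(pq_ratio))
```

Both orders end with DES and AES, while the quantum samples ANU and USTC occupy the first two positions.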

Twenty-four variation maps are shown in Table 3 as four groups. Each group 
contains six 2D maps. For the three groups of {1DQ, 1DP, 2DPQ} variation distributions, 
eighteen enlarged 2D maps are shown with significant waveforms. For the group of 
1DQ + 1DP + 2DPQ distributions, six maps are shown with three average variations 
satisfying $1DQ_x > 1DP_x > 2DPQ_x$, respectively. This fourth group of variation 
measures combines the three variations of 1DQ + 1DP + 2DPQ in one unified 2D map. 
From the six 2D maps, the stationary randomness of global variations is clearly 
illustrated. 

In Table 4, the AES and DES maps may show high-frequency waves, and the other enlarged 
2D maps have stationary properties. In Table 5, larger waves appear and more details 
can be identified. Although significant variations appear in different 2D 
maps, it is difficult to classify them by their variation behaviors. 
Table 3 Variation distributions of six samples (columns: 1DQ, 1DP, 2DPQ, 1DQ+1DP+2DPQ; rows: AES, DES, A5, RC4, ANU, USTC) 


In Table 6, three variation ratios of differences are bounded by $0.0034 < |dQ^R\%| < 1.73$, 
$0.056 < |dP^R\%| < 3.96$, and $0.073 < |dPQ^R\%| < 4.27$, respectively. In general, 
three groups of variation ranges on differences meet $\{dQ^R\%\} \subset \{dP^R\%\} \subset \{dPQ^R\%\}$. 
From a stationary testing viewpoint, 2DPQ shows the strongest distinct 
property, 1DQ has the weakest numeric property, and 1DP provides the middle 
identifying property. 
Table 4 Six variations on 2D maps (panel titles include AES_maximum_of_each_step and DES_maximum_of_each_step) 
Since three groups can be identified, namely {AES, DES} block ciphers, {A5, RC4} 
stream ciphers, and {ANU, USTC} quantum ciphers, stationary randomness quantities 
can be classified into three categories, {AES, DES}-highest, {A5, RC4}-middle, and 
{ANU, USTC}-lowest, to provide distinct variation measures in the testing. The three 
quantity categories may correspond to artificial, semi-artificial, and natural 
designs for the various generating mechanisms of cryptographic resources. 

Considering all differences of variation ratios on six samples listed in Table 6, 
only differences of 0.0034% to 4.27% (thirty-four in one million to about four percent) 
are recognized. From a measuring viewpoint, all six samples show distinct 
stationary randomness properties. 


Table 5 Larger six variations on 2D maps (panel titles include AES_maximum_of_each_step and RC4_maximum_of_each_step) 
Table 6 Differences of variation ratios among three maximals of six samples 

dQ^R%  | ∅        AES       DES       A5        RC4       ANU       USTC 
∅      | 0        -3.0      -2.55     -1.4136   -1.5471   -1.2722   -1.4102 
AES    | 3.0      0         0.45      1.5864    1.4529    1.7278    1.5808 
DES    | 2.55     -0.45     0         1.1364    1.0029    1.2778    1.1398 
A5     | 1.4136   -1.5864   -1.1364   0         -0.1335   0.1414    -0.0034 
RC4    | 1.5471   -1.4529   -1.0029   0.1335    0         0.2749    0.1369 
ANU    | 1.2722   -1.7278   -2.2778   -0.1414   -0.2749   0         -0.138 
USTC   | 1.4102   -1.5898   -1.1398   -0.0034   -0.1369   0.138     0 

dP^R%  | ∅        AES       DES       A5        RC4       ANU       USTC 
∅      | 0        -3.96     -3.5      -2.51409  -2.25498  -2.1992   -1.9265 
AES    | 3.96     0         0.46      1.44591   -0.54996  1.7608    2.0335 
DES    | 3.5      -0.46     0         0.98591   1.24502   1.3008    1.5735 
A5     | 2.51409  -1.44591  -0.98591  0         0.25911   0.31489   0.58759 
RC4    | 2.25498  0.54996   -1.24502  -0.25911  0         0.05578   0.32848 
ANU    | 2.1992   -1.7608   -1.3008   -0.31498  -0.05578  0         0.2727 
USTC   | 1.9265   -2.0335   -1.5735   -0.58759  -0.32848  -0.2727   0 

dPQ^R% | ∅        AES       DES       A5        RC4       ANU       USTC 
∅      | 0        -9.02     -8.21     -5.5818   -4.96913  -4.8276   -4.7544 
AES    | 9.02     0         0.81      3.4382    4.05087   4.1924    4.2656 
DES    | 8.21     -0.81     0         2.6282    3.24087   3.3824    3.4556 
A5     | 5.5818   -3.4382   -2.6282   0         0.61267   0.7542    0.8274 
RC4    | 4.96913  -4.05087  -3.24087  -0.61267  0         0.14153   0.21473 
ANU    | 4.8276   -4.1924   -3.3824   -0.7542   -0.14153  0         0.0732 
USTC   | 4.7544   -4.2656   -3.4556   -0.8274   -0.21473  -0.0732   0 


6 Conclusion 


It is feasible to evaluate stationary properties for a random sequence using the 
testing system. Using three maps {1DP, 1DQ, 2DPQ}, a series of variation measures 
and their ratios are illustrated. Maximal measures are extracted for shifts 
r from 0 to m. For each sample, three 2D maps of variation curves provide refined 
characteristics to evaluate stationary randomness properties globally. Sample 
variation maps are shown in similar-equal relationships among the same group 
of average variations. Further explorations and applications are required to check 
the testing system on other applications of cryptographic streams. Three quantity 
categories of artificial, semi-artificial, and natural designs may be explored to get 
intrinsic stationary randomness information from refined testing and future 
explorations. 
Part IV 
Theoretical Foundation—Meta Model 


TAO produced the First—[Heaven]. 

The First produced the Second—[Earth]. 

These Two produced the Third. 

The Third produced all things, 

and these turn their back upon the Yin and embrace the Yang. 
The intermingling of these two Afflati results in harmony. 
—Lao Tzu (Tao Te Ching) 


Knowledge has the form of a tree, and since metaphysics is the most 
fundamental one of the theoretical disciplines, it represents the roots 
of the tree. 


—Gonzalo Rodriguez-Pereyra 


Meta-design is much more difficult than design; it's easier to draw 
something than to explain how to draw it. 


—Donald Knuth 


From a historical viewpoint, the meta model was developed earlier than variant logic; 
it provides useful concepts and a hierarchical organization to support this new logic 
framework. The core paper on the concept cell (Concept Cell Model for Knowledge 
Representation) was published in Int. J. Inf. Acquisition 01, 149-168 (2004), World 
Scientific Press. In relation to the multiple probability approach, a research paper 
(Voting Theory for Multiple Candidates to Resolve Intrinsic Uncertain Problems of 
Election) was published in Journal of System Engineering Theory and Practices 
(Chinese) 1000-6788(2002)12-0101-10. This paper proposed a useful multiple 
probability model to resolve intrinsic uncertain properties in elections. 

Part IV is composed of two chapters (9 and 10). 

Chapter “Meta Model on Concept Cell” outlines a meta model on concept cell 
for knowledge representation to provide a brief core structure on this network 
topology scheme for three levels of knowledge clusters. 
Chapter “Voting Theory for Two Parties Under Approval Rule” describes voting 
theory for two parties under the approval rule to show that the multiple probability 
model is also useful in two-party conditions. 


Meta Model on Concept Cell 


Jeffrey Zheng and Chris Zheng 


Abstract Applying network topology schemes, two types of three levels of meta 
knowledge representations have been established. This chapter proposes a meta 
model on concept cell that provides a meta organisation of knowledge in natural 
and artificial intelligent systems structurally. 


Keywords Knowledge model - Meta representation - Three levels of concept 
lattice - Description - Procedure - Core organisation 


1 Introduction 


A meta model on concept cell is outlined to represent knowledge in knowledge 
systems (KSs). This model has novel features that are of considerable interest for 
knowledge representation (KR). 

Polanyi proposed a knowledge model in the 1940s in which knowledge is composed of 
two categories: tacit and explicit [1, 2]. In the 1970s, Anderson, working from cognitive 
psychology, identified knowledge with another two categories: declarative and 
procedural [3-5]. In the early 1990s, a procedural model was proposed by Nonaka, who 
identified four transformations: tacit → tacit (socialisation), tacit → explicit 
(externalisation), explicit → explicit (combination) and explicit → tacit (internalisation) [6, 
7]. In 2000, a model was proposed by Nickols to arrange four classes (tacit, explicit, 
procedural and declarative) into three categories: tacit, explicit and implicit. In my 
opinion, the Nickols model is unsatisfactory for three reasons: 

(i) it is a triangle of categories without a fixed order, 
(ii) there is uncertainty in the implicit category and 
(iii) there is no structural correspondence to other KR methodologies. 


To address the first two weaknesses of the Nickols approach, an executable 
knowledge model was proposed. A triplet (tacit, implicit and explicit) is constructed 
as a procedural structure. Implicit is the middle node, linked with the two other 
nodes in four transformations: tacit → implicit (externalisation), implicit → explicit 
(retrieval), explicit → implicit (category) and implicit → tacit (internalisation). In 
addition, the model provides distinguishable foreground/background and human/ 
machine knowledge interfaces [8]. 

To explore different KS applications from philosophy, logic and digital libraries, 
to gene, chemistry, software and system engineering [9-11], people arrange common 
concepts to construct ontology libraries and procedures as core structures [12, 13]. 
Advanced system modelling tools such as ARIS [14], CIMOSA [15] and IDEF [16] 
provide function, data and process models and ontology description capture 
methodologies for constructing modern intelligent knowledge systems [17]. Because many 
contradictions, confusions, difficulties and unclear properties exist in KR foundation 
levels [13, 18, 19], consistently categorising practical knowledge into tacit/explicit 
and procedural/declarative is extremely hard for researchers, scientists, philosophers, 
psychologists and knowledge workers [14-17, 20, 21]. 

Practical computer-aided modelling systems use pragmatic approaches to manip- 
ulate simple structures (list, tree, stack, class and component) in real applications 
[14-17, 21]. Usually, declarative concepts seem easier to capture than procedural 
concepts. Based on this, many people believe that declarative knowledge is explicit 
and procedural knowledge is tacit [16, 17, 22]. A radical extension of a knowledge 
model in KR is proposed in a concept cell that arranges knowledge in KS for natural 
and artificial organisation. This model can fully support the above-mentioned knowl- 
edge models to consistently identify four categories of knowledge: tacit, explicit, 
declarative and procedural. The model also provides a core ontology to distinguish 
a hierarchy of structures within the core of a concept. According to convention, the 
word concept is used as an equivalent to knowledge in this chapter. 


2 Concept Cell Model 


Let K denote a cell of concepts (a concept cell) that is composed of three parts: M 
membrane, N nuclei and G gel. M is a frame that provides a container to hold both N 
and G. G is a base description of the content and N establishes a foundation of the cell. 
M inputs provide external concepts (externals) for N from deeper levels, and M then 
outputs the current content to other upper level cells. N is composed of two components: 
D declarative nucleus and P procedural nucleus. To illustrate this organisation, a cell 
K = {M, N, G} is shown in Fig. 1. 
Fig. 1 A concept cell K. a A slice, b hierarchy. Labels in the figure: (a) a cell 
K = {M, N, G}; (b) ontology of K, with M membrane (interface), N nuclei (foundation), 
G gel (base), N = {D, P}, D declarative nucleus = {explicit, implicit, tacit, core} and 
P procedural nucleus = {life-cycle, start, operation, finish} 


For the convenience of construction, a special lattice is employed [23]. Only 
directed graphs are used, similar to the most popular signal flow graphs [24] used to analyse 
and synthesise process control [9], computer architecture [10], electric circuits [25], 
network topology [26-28] and dynamic systems [25, 29]. However, no lattice is allowed 
to contain a loop, and all lattices are composed of directed acyclic graphs [26, 28]. 
In a lattice, a node represents a cell and lattice links are determined by dependencies 
among nodes. Because the most complex part of a cell is its nuclei structures, detailed 
interior organisation is necessary to explore meanings of knowledge. To simplify, a 
simple cell (or a cell, if there is no confusion) is studied here, where the nuclei of the 
cell are composed of only one declarative lattice and one procedural lattice. 

Using lattice language, a cell K is described in Fig. 2. Different graphic symbols 
represent distinct forms of concepts as nodes. A rounded rectangle represents a 
general node; an octagon is a specific node; a rectangle shows a declarative node 
and an oval corresponds to a procedural node. 

Fig. 2 A concept cell in lattices. Labels in the figure: K cell; M interface between 
externals and internals; G base; N foundation, N = {D, P}; D declarative lattice with 
nodes D = {C, T, I, E} (C core, T tacit, I implicit, E explicit); P procedural lattice with 
nodes P = {L, S, O, F} (L life-cycle, S start, O operation, F finish); declarative and 
procedural links 

A simple lattice cell is composed of four levels. Node M, which interfaces between 
externals and internals, forms the first level. Nodes G and N, both linked with node M, 
form the second level. Node G contains the base description and node N plays a 
foundation role in the cell. Nodes D and P, both linked with node N, form the third level. 
Node D contains one lattice in declarative dependency and node P contains one lattice 
in procedural dependency. Finally, two sets of nodes linked with nodes D and P form 
the fourth level. Nodes D and P each contain four nodes. Among each four nodes, two 
links are associated with three nodes. 
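
The four-level organisation can be sketched as a small data structure. This is an illustrative sketch only: the class and field names are hypothetical, and the sample concepts are invented for the example rather than taken from the chapter.

```python
from dataclasses import dataclass, field


@dataclass
class Lattice:
    """Fourth level: nodes plus directed links (src, dst) under one dependency."""
    kind: str                                  # "declarative" or "procedural"
    nodes: set = field(default_factory=set)
    links: set = field(default_factory=set)

    def add_link(self, src, dst):
        self.nodes.update((src, dst))
        self.links.add((src, dst))


@dataclass
class Nuclei:
    """Third level: N = {D, P}."""
    declarative: Lattice = field(default_factory=lambda: Lattice("declarative"))
    procedural: Lattice = field(default_factory=lambda: Lattice("procedural"))


@dataclass
class ConceptCell:
    """First and second levels: K = {M, N, G}."""
    membrane: list = field(default_factory=list)    # M: imported externals
    gel: str = ""                                   # G: base description
    nuclei: Nuclei = field(default_factory=Nuclei)  # N: foundation


# Invented sample content, for illustration only.
cell = ConceptCell(membrane=["vehicle", "car", "engine"], gel="toy example")
cell.nuclei.declarative.add_link("vehicle", "car")   # more general -> more specific
cell.nuclei.procedural.add_link("start engine", "drive")
print(cell.nuclei.declarative.nodes, cell.nuclei.procedural.links)
```

Keeping D and P as separate lattice objects mirrors the text: both hang off N, but each carries its own kind of dependency.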


3 Core Components 


The following four conditions can create the content of a concept cell: 


(i) M acts as an interface to import a finite number of externals into nuclei and to 
export the content to other cells. 
(ii) G provides the base description of the cell and N collects all externals from M 
for development. 
(iii) Two lattices D, P are constructed from N's externals to carry out two 
dependencies. 

An N external corresponds to a D node. A declarative dependency is employed 
to order all nodes of D as a declarative lattice. If two distinct nodes have declarative 
dependency, then the node with the more general meaning is taken as the first node and 
a declarative link connects from the first to the second. After building up declarative 
dependency among all nodes, D becomes a directed acyclic lattice. 

Instances of an N external correspond to nodes of P satisfying procedural 
dependency. P is composed of sequences of nodes by instances of externals. If two instances 
represent two nodes, then the node that has to be handled earlier is specified as the 
first node and a procedural link connects the two nodes from the first to the second. 
After all procedural dependencies are established among nodes, P is converted into 
a directed acyclic lattice. 


(iv) Two lattices are composed of eight distinguishable node sets: 


Four sets of declarative nodes C, T, I, E are identified: C core, T tacit, I implicit 
and E explicit, respectively. 

Four sets of procedural nodes L, S, O, F are identified: L life cycle, S start, O 
operation and F finish. 

The meaning of the construction process can be explained as follows. In the first level 
of the kernel, M collects all externals to provide extra knowledge for its nuclei. The 
second level has two parts: G, N. The G node provides the base description. Because 
each external is mapped to a node, the number of N externals equals the number of nodes 
in D. A declarative dependency is valid for all D nodes, which creates a directed acyclic 
declarative lattice. Using instances of N externals as nodes, P is assembled 
using procedural dependency linked with selected nodes, finally forming P itself 
as a procedural lattice. Since both declarative and procedural lattices are organised 
by ordered dependencies, they are directed acyclic lattices that 
support wider requirements from theoretical foundations to practical applications. A 
simple construction example is shown in Fig. 3(i-v). 
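
Whether a set of dependency links really forms a directed acyclic lattice can be checked mechanically. The sketch below uses Kahn's topological ordering as one standard way to do this; it is an illustration, not a procedure taken from the chapter, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def is_acyclic(nodes, links):
    """Return True if the directed links (src, dst) over nodes contain no loop."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for src, dst in links:
        successors[src].append(dst)
        indegree[dst] += 1

    # Repeatedly remove nodes that have no remaining incoming link.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # If a loop exists, its nodes never reach indegree 0 and are never removed.
    return removed == len(nodes)


print(is_acyclic({"a", "b", "c"}, [("a", "b"), ("b", "c")]))
print(is_acyclic({"a", "b", "c"}, [("a", "b"), ("b", "c"), ("c", "a")]))
```

The same check applies to both lattices, since the declarative and procedural dependencies differ only in what a link means, not in the graph structure.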

For an acyclic lattice, four distinct node sets are notable in Fig. 3(vi). They are 
the singleton, source, branch and sink node sets, respectively, borrowed from network 
topology [23, 26, 30]. A singleton node provides an isolated concept. A source node 
exports a concept. A sink node imports concept(s), and a branch node transfers 
concept(s) from input link(s) to output link(s). If there is only one external in N, then the 
singleton set contains one single node and the other three sets are empty. If there is 
more than one node in N, then the singleton set is empty. In this case, the source set 
is composed of nodes that have at least one link to another node but no link from 
other nodes. Every other node consequently belongs to the branch set or the sink set. 
In contrast to the source set, the sink set collects all nodes with links from other 
nodes but without a link to any node. A sink node has to be the last node in a node 
path of a lattice, to which at least one node from the branch or source set is linked. 
Unlike nodes in the source and sink sets, a node in the branch set links with at 
least two nodes, to and from the source, sink and branch sets. A branch node receives 
from other node(s) and outputs to other node(s). These sequence nodes provide 
connectivity among nodes. 

Fig. 3 External concepts, declarative and procedural lattices and node sets 

Although four node sets can be identified by their different connectivity, it is not 
convenient to use the same vocabulary to describe two distinct lattices under different 
dependencies. For convenience, each node set is given a proper name to indicate its 
specific relationship in familiar KR terms. 

The D lattice represents an invariant structure (the simplest cases: tree, list) similar 
to a traditional data structure hierarchy. Because a sink node is equivalent to a factor 
data at the leaf level (the lowest location) of a data structure, the sink node has to be 
represented as explicit knowledge. Therefore, the sink set of D is explicit. In contrast, 
a source node provides invaluable knowledge from the highest level of externals. There 
is no link to this node, and anyone wanting to explain the meaning of the node must 
capture knowledge from other sources far beyond the node itself. Consequently, a 
source node always contains deeper meanings than can be articulated. Hence, 
the source set of D is tacit. Different from the sink and source sets, a node in the branch 
set has connectivity from higher tacit node(s) and to higher explicit node(s). The 
branch set of D represents a typical intermediate property. Consequently, the branch 
set of D is implicit. A singleton node provides a complete concept. The node itself 
is the centre of the D lattice. Therefore, the singleton node set of the D lattice is 
a core. 

The four node sets of the P lattice satisfy different properties. The P lattice has a 
close relationship to process modelling, which provides a time arrow as controllable 
sequences. A node in the P lattice is an instance of a node in the D lattice. The 
singleton node set of the P lattice is not empty if only one node is in the P lattice. The 
singleton node set of the procedural lattice represents a complete procedure of P itself. 
Logically, the procedural singleton node set is a life cycle. When two or more nodes 
are included, the three node sets of the P lattice have to link together in sequential 
relationships. Time-relevant sequences with finite numbers of connected nodes must have 
distinguishable commencing and ending nodes that correspond to start and finish 
conditions, respectively. In addition, all intermediate nodes provide operational capacities 
to deliver knowledge to subsequent nodes. Consequently, the three node sets of the 
P lattice are called start, finish and operation, respectively. 

The relative properties of the cell model and other schemes are compared in Table 1. 
In the table, TM represents Theoretical Model, which is used in KS applications. ST 
denotes Structural Theory, which uses structured organisations to represent complex 
dependency among members. ES indicates Engineering Systems, which provide mixed 
theories, experiences and skills with commercial system modelling tools for pragmatic 
applications, especially in enterprise management, manufacturing and building 
industries, software and hardware systems, global communication networks, and web and 
Internet environments. ES applies advanced TM methodologies plus business experience 
and engineering skills to solve practical problems efficiently using system engineering 
methodologies in global business explorations. 

From this comparison, it is clear that the existing systems most similar to 
the cell concept model come from enterprise modelling, which provides all functionality 
for the ten meta nodes from engineering practices. However, other theoretical models 
cannot support full functionality. This property indicates the potential capacity for 
applying the cell concept model from theoretical foundations to practical applications. 
Details of the concept cell have been published [31] to represent further classifications, 
recursive constructions, non-simple cells and sample applications for knowledge 
construction systems. 
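
The naming of the four node sets in the D and P lattices can be sketched as a classification by incoming and outgoing links. This is a minimal illustration of the rules stated in this section; the function and dictionary names are hypothetical.

```python
# Names given in the text to the four connectivity-based node sets.
D_NAMES = {"singleton": "core", "source": "tacit",
           "branch": "implicit", "sink": "explicit"}
P_NAMES = {"singleton": "life cycle", "source": "start",
           "branch": "operation", "sink": "finish"}


def classify(nodes, links, names):
    """Map each node to its set name, based on its incoming/outgoing links."""
    has_in = {dst for _, dst in links}
    has_out = {src for src, _ in links}
    result = {}
    for node in nodes:
        if node in has_in and node in has_out:
            role = "branch"      # receives from and outputs to other nodes
        elif node in has_out:
            role = "source"      # links to other nodes, no link from any node
        elif node in has_in:
            role = "sink"        # links from other nodes, no link to any node
        else:
            role = "singleton"   # isolated concept
        result[node] = names[role]
    return result


chain = [("a", "b"), ("b", "c")]
print(classify(["a", "b", "c"], chain, D_NAMES))
print(classify(["a", "b", "c"], chain, P_NAMES))
print(classify(["x"], [], P_NAMES))
```

The same connectivity test serves both lattices; only the vocabulary changes, which is the point made in the text about giving each node set a proper name.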
Table 1 Comparisons on different models 

Model | D T I E C P S O F L | Notes 
Concept Cell | ST ST ST ST ST ST ST ST ST ST | A hierarchy of four levels 
Polanyi [1, 2] | TM TM | Tacit and Explicit 
Anderson [3-5] | TM TM | Declarative and Procedural 
Nonaka [6, 7] | ST ST | Four Transformations 
Nickols [8] | TM TM TM | Tacit, Explicit and Implicit 
Zheng et al. [9] | ST ST ST | Four Transformations 
Lattice Theory [17, 28] | TM TM TM TM TM TM | Theoretical Model 
Ontology Metalogic [13-15] | TM TM TM TM TM | Logic, Metaphysics, ... 
Conceptual graphs [15] | TM TM TM TM TM | Logic reasoning in graphs 
First order logic [15] | ES ES ES ES ES | Symbol notations 
Enterprise Modelling [11-14, 27] | ES ES ES ES ES ES ES ES ES ES | ARIS, IDEF ... 
Object Oriented [13, 27] | ES ES ES ES ES ES ES | OTM, UML, C++, Java 
Function/.../Logic [13, 27] | ES ES ES ES ES | Algol, Fortran, Lisp, Prolog 

Ten basic symbols: {D, T, I, E, C}, { P, S, O, F, L} 

D: Declarative; T: Tacit, I: Implicit, E: Explicit, C: Core; 

P: Procedural; S: Start, O: Operation, F: Finish, L: Life cycle 
Three types of models: {ST, TM, ES} 

ST: Structural theory 

TM: Theoretical model 

ES: Engineering system 
Voting Theory for Two Parties Under Approval Rule 


Jeffrey Zheng 


Abstract Two models, the Simple Ballot Model (SBM) and the Component Ballot Model 
(CBM), are proposed for solving uncertainty in an election when two candidates 
gain the same number of votes under the approval rule. The SBM establishes a 
framework to support counting. In separating the two candidates, it is essential to 
extract additional information from dominantly valid votes. The CBM uses 
probability matrices, vectors and a permutation group as components. A stable voting 
mechanism under permutation invariance can be created to distinguish candidates. The result 
of the chapter establishes a voting authority to resolve uncertainty between two candidates 
under the approval rule. 


Keywords Approval rule - Permutation invariant - Feature vector - Uncertainty - 
Voting system 


JEL Classifications D72 - D81 - C34 - C31 


1 Introduction 


As a common practice in a modern democratic society, voting is a practical way 
to resolve a contest where each candidate seeks to gain maximal support from the 
electors. Approval voting is a voting procedure in which electors can vote for as 
many candidates as they wish. Each approved candidate receives one vote and 
the candidate with the most votes wins. Approval voting, unlike more complicated 
ranking systems, is easy for electors to understand and use. This voting 
method is widely used today by various governments and organizations around the 
world (including its use by the United Nations to elect the secretary-general). 
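
The counting rule just described can be sketched in a few lines. This is a minimal illustration with invented ballots; the function names are hypothetical and the sketch is not the Simple Ballot Model defined later in the chapter.

```python
from collections import Counter


def approval_tally(votes):
    """Count one vote per approved candidate on each ballot."""
    tally = Counter()
    for approved in votes:
        tally.update(set(approved))   # a candidate counts once per ballot
    return tally


def winners(tally):
    """Return every candidate sharing the highest count (more than one means a tie)."""
    if not tally:
        return []
    top = max(tally.values())
    return sorted(c for c, n in tally.items() if n == top)


# Invented ballots: each inner list holds the candidates one elector approves.
votes = [["A", "B"], ["A"], ["B"], ["C", "B"], ["A"]]
tally = approval_tally(votes)
print(dict(tally), winners(tally))
```

In this example two candidates share the top count, which is exactly the uncertain situation that the chapter sets out to resolve.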

To maintain healthy economic and political progress in modern democratic societies, it 
is necessary to apply reliable and convenient voting methodologies and tools to ensure 
fairness, efficiency and transparency and to overcome paradoxes and difficulties in 
elections. 


1.1 Brief Review of Voting Systems 


We can find interesting voting-based models and practices in many ancient stories, 
from Chinese literature to Roman and Greek history. Just before the French 
Revolution, at the French Academy, de Borda [1] and de Condorcet [2] proposed the 
Borda rule and the Condorcet procedures. They wanted to use new voting methods 
to resolve difficulties and unfair results under traditional plurality-based voting rules 
in elections for the Academy. In the 1920s, Hotelling [3] investigated the equilibrium 
of spatial economic competition between two firms in location and price. During 
World War II, von Neumann and Morgenstern [4] developed the Theory of Games using 
differential equations to investigate complicated competition behaviors. This 
theoretical foundation has had a strong influence on the development of analytical 
methodologies and tools, from applying pre-designed strategic policies to predicting practical 
election outcomes. Under fairness conditions, Arrow [5] proved his famous Impossibility 
Theorem, which claims that there is no single election procedure that fairly decides the 
outcome of an election involving three or more candidates. Various ideas, methods 
and technologies have emerged to resolve voting difficulties [6-9]. 


1.2 Problems in the 2000 American Election 


The most debatable problem in the 2000 American election, the 2K-election, is: 


Do the machine-rejected ballots need to be manually recounted? 


The practical solution of the 2K-election problem was finally decided by the votes 
of the nine justices of the US Supreme Court on the lawsuits from the Florida Supreme 
Court. 

This indicates that current voting theories and vote-counting models all fail 
to serve as an authority for resolving the problem. 

Although the 2K-election was held under the plurality rule, not the approval rule, 
the approval rule cannot guarantee that similar uncertainty is avoided when a 
large number of electors are involved. It is necessary to establish a relevant theoretical 
structure to avoid possible problems in the future. 
1.3 Structure of the Chapter 


This chapter proposes two models that construct a voting theory to resolve 
2K-election-like problems and other paradoxes in voting practices. Only a voting 
system under the approval rule is considered. 

In Sect. 2, a Simple Ballot Model (SBM) is proposed. Using the SBM, the 
separable and uncertain conditions for the ballot papers are established. To show some 
practical strategies and relevant problems in current voting methodologies, four 
additional rules (reducing error probability, merging other candidate votes, re-election, 
and court decision) that are commonly used in practical voting processes are 
discussed. 

In Sect. 2.8, the error margin for the 2K-election problem is analyzed. Though 
voting practice is not an exact science, the error margin of 0.233% in the 
event still cannot be accepted as an accurate measure. Although almost 99.8% 
of the valid votes were counted, there is still no way of determining who 
the winner is. Therefore, attention shifts to the 0.2% of votes which were already 
deemed invalid. This problem highlights that the voting system needs to improve, 
and a method of extracting additional information from valid votes to separate the 
two candidates under uncertainty conditions becomes essential. 

In Sect. 3, a new voting model, the Component Ballot Model (CBM), is defined 
and constructed to provide the essential construction for extracting more 
information from votes for comparisons. Based on multiple feature matrices (similar to 
contingency tables in classical statistics), probability feature vectors, a permutation 
invariant group and other advanced mathematical tools, multiple pair sets of 
feature index families for two candidates are constructed. This mechanism establishes 
a voting authority to make a decision for an election. After the mathematical 
definitions and constructions of feature matrix, feature vector, probability feature vector 
and feature index, the most important results are summarized in the Two-D Separable 
Proposition and the Voting Authority Proposition. 

Because only the valid votes are taken into account, the election model has
intrinsic stability and yields reliable results immediately after the election.
Confusion, frustration and dissatisfaction such as those experienced in the
2K-election can be avoided.

In the light of this research, some further research directions are suggested in
Sect. 4.


2 Simple Ballot Model 


2.1 Key Words in Election 


Key words used in an election event can be defined as follows.

Election: a special event based on counting votes for a winner (normally whoever
attracts the most votes wins the election)

Candidate: a person who has been nominated in an election

Elector: a person who may legally vote in an election

Ballot: a pre-designed form used to record the choices of an elector

Vote: a ballot on which the choices of an elector are recorded

Poll: the collection of votes from all legal electors

Decision: a result on who wins the election.


The Simple Ballot Model simulates the simplest scenario of the whole voting
procedure, based upon all ballots directly collected from an election under the
approval rule. In this scenario, each elector creates exactly one vote, which may
select any number of candidates from the list of candidates.


2.2 Definitions 


For an ideal election involving $n$ ($n > 2$) candidates, let
$C = \{c_1, c_2, \ldots, c_n\}$ be the set of selected candidates. A ballot
$B = (c_1, c_2, \ldots, c_n)$ is a pre-designed form containing the list of
candidates for whom the electors may vote.

A vote is a record of a ballot $B$. Let $v$ denote a vote. It is valid if

$$v = (v_1, v_2, \ldots, v_n), \quad v_i \in \{0, 1\}, \quad i \in [1, n], \quad \sum_{i=1}^{n} v_i > 0.$$

Otherwise, if $\exists v_i = x \notin \{0, 1\}$, $i \in [1, n]$, or
$\sum_{i=1}^{n} v_i = 0$ (null selection), then the vote $v$ is invalid. Here
$v_i = 1$ indicates that candidate $c_i$ is selected, $v_i = 0$ indicates that
$c_i$ is not selected, and $v_i = x$ indicates an invalid selection for $c_i$.
Normally a vote $v$ has a value region from $(0, 0, \ldots, 0)$ to
$(1, 1, \ldots, 1)$, ..., $(x, x, \ldots, x)$.

An elector can only create one vote and there are a total number of N (>n) votes 
in the election. 

A poll $V$ is a vote collection in which all votes can be arranged as an array with
$N$ entries:

$$V = (v(1), \ldots, v(t), \ldots, v(N)), \quad t \in [1, N], \tag{2.1}$$

where $v(t)$ denotes the vote of the $t$th elector. As each candidate has a number,
let $k \in v(t)$ denote that the $t$th elector selected the $k$th candidate on the
vote.

For example, with $n = 6$, $N = 8$, a poll $V$ is
$V = (v(1), \ldots, v(t), \ldots, v(8))$, $t \in [1, 8]$:


v(1) = (0, 0, 1, 1, 0, 0), v(2) = (0, 1, 0, 1, 0, 0), v(3) = (0, 1, 0, 1, 1, 0),
v(4) = (1, 0, x, 1, 1, 0), v(5) = (0, 1, 0, 1, 0, 0), v(6) = (0, 0, 1, 1, 1, 0),
v(7) = (0, 0, 1, 0, 0, 0), v(8) = (0, 0, 0, 0, 0, 0)


In this poll, {v(1), v(2), v(3), v(5), v(6), v(7)} are valid votes
($v_3(1) = v_4(1) = 1$ indicates that the 1st vote selected the third and fourth
candidates). In addition, v(4) contains an uncertain selection ($v_3(4) = x$) and
v(8) is a null selection; both votes are invalid.
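
The validity rule can be checked mechanically on the example poll. The following is a minimal sketch, not part of the original model; the helper name `is_valid` and the use of the string "x" for an unreadable selection are illustrative assumptions.

```python
# Minimal sketch: classify the eight example votes as valid or invalid.
# Assumption: an unreadable selection is stored as the string "x".
X = "x"

def is_valid(vote):
    """Valid if every entry is 0 or 1 and at least one candidate is selected."""
    return all(v in (0, 1) for v in vote) and sum(vote) > 0

poll = [
    (0, 0, 1, 1, 0, 0),  # v(1)
    (0, 1, 0, 1, 0, 0),  # v(2)
    (0, 1, 0, 1, 1, 0),  # v(3)
    (1, 0, X, 1, 1, 0),  # v(4): uncertain selection
    (0, 1, 0, 1, 0, 0),  # v(5)
    (0, 0, 1, 1, 1, 0),  # v(6)
    (0, 0, 1, 0, 0, 0),  # v(7)
    (0, 0, 0, 0, 0, 0),  # v(8): null selection
]

valid_ids = [t for t, v in enumerate(poll, start=1) if is_valid(v)]
invalid_ids = [t for t, v in enumerate(poll, start=1) if not is_valid(v)]
print(valid_ids)    # [1, 2, 3, 5, 6, 7]
print(invalid_ids)  # [4, 8]
```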

Let $V_0$ denote the invalid sub-poll in the election; it collects all invalid votes
from the poll $V$. Let $V_C$ denote the valid sub-poll in the election. The
sub-polls $V_C$ and $V_0$ partition the poll $V$, i.e.

$$V = V_C \cup V_0.$$

Let $V_k$ denote a candidate sub-poll in the election. For any $k \in [1, n]$,
$V_k$ collects all valid votes from the poll $V$ for the $k$th candidate:

$$V_k = \{ v(t) \mid v_k(t) = 1,\ k \in [1, n],\ t \in [1, N],\ v(t) \in V_C \}.$$

Let $\vec{V}$ denote a poll vector,

$$\vec{V} = (V_0, V_1, \ldots, V_k, \ldots, V_n), \quad k \in [1, n]. \tag{2.2}$$


An SBM is a collection of a ballot form, all votes, the poll and the poll components
for an election:

$$SBM = (B \mid V; \vec{V}). \tag{2.3}$$


Let $N_{VC}$ denote the number of votes in the valid poll $V_C$, $N_{VC} = |V_C|$.
Let $N_k$ denote the number of votes in the candidate sub-poll $V_k$,
$N_k = |V_k|$, $k \in [1, n]$, and let $N_0$ denote the number of votes in the
invalid poll $V_0$.

The total number of votes in an election, $N$, is equal to the number of valid votes
$N_{VC}$ plus the number of invalid votes $N_0$, i.e.

$$N = N_{VC} + N_0. \tag{2.4}$$

Let $p_{VC} = |V_C| / |V| = N_{VC} / N$ denote a measure of the valid votes.

For any poll vector $\vec{V}$, let $p_k = |V_k| / |V| = N_k / N$,
$1 \le k \le n$, denote a measure of the $k$th candidate and
$p_0 = |V_0| / |V| = N_0 / N$ denote the measure of the invalid votes.

Under the approval rule, there are many overlaps among different sub-polls.
Considering two candidate sub-polls and their common part, if
$\exists k, l \in [1, n]$, $V_k, V_l \subset V_C$, $V_k \cap V_l \neq \emptyset$,
then

$$|V_k \cup V_l| = |V_k| + |V_l| - |V_k \cap V_l|. \tag{2.5}$$

In general, we have

$$|V_k \cup V_l| \le |V_k| + |V_l|. \tag{2.6}$$

Let $\Psi$ denote a frequency vector,

$$\Psi = (p_0, p_1, \ldots, p_k, \ldots, p_n), \quad k \in [1, n]. \tag{2.7}$$
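
As an illustration, the sub-poll sizes and the frequency vector of the eight-vote example poll can be computed directly. This is a minimal sketch; the variable names (`valid`, `counts`, `psi`) are illustrative and not taken from the original text.

```python
# Minimal sketch: candidate counts N_k and frequency vector (p_0, p_1, ..., p_n)
# for the example poll with n = 6 candidates and N = 8 votes (2 of them invalid).
n, N, N0 = 6, 8, 2

# The six valid votes v(1), v(2), v(3), v(5), v(6), v(7) of the example.
valid = [
    (0, 0, 1, 1, 0, 0),
    (0, 1, 0, 1, 0, 0),
    (0, 1, 0, 1, 1, 0),
    (0, 1, 0, 1, 0, 0),
    (0, 0, 1, 1, 1, 0),
    (0, 0, 1, 0, 0, 0),
]

# N_k = |V_k|: number of valid votes that select candidate k.
counts = [sum(v[k] for v in valid) for k in range(n)]

# Frequency vector with the invalid-vote measure p_0 in front.
psi = [N0 / N] + [c / N for c in counts]

print(counts)    # [0, 3, 3, 5, 2, 0]
print(psi)       # [0.25, 0.0, 0.375, 0.375, 0.625, 0.25, 0.0]
print(sum(psi))  # 1.875
```

Because approval votes overlap, the measures add up to more than 1 here; the total always lies between 1 and $n$.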


2.3 One-Dimensional Feature Distribution 


The frequency vector $\Psi$ corresponds to a density distribution. The following
relations hold.

$$1 = p_{VC} + p_0; \tag{2.8}$$
$$1 \ge p_k \ge 0, \quad k \in [1, n]. \tag{2.9}$$

Because there is no further partition among sub-polls, the vector $\Psi$ forms a
one-dimensional frequency feature histogram.

Considering (2.6), (2.8) and (2.9), there is an inequality

$$1 \le \sum_{k=0}^{n} p_k \le n. \tag{2.10}$$

If the sub-polls partition the poll, then $1 = \sum_{k=0}^{n} p_k$. In the worst
case scenario, if all valid votes select all candidates and there are no invalid
votes, then

$$p_0 = 0, \quad p_1 = \cdots = p_n = 1, \quad \sum_{k=0}^{n} p_k = n.$$


2.4 Separable Condition 


When $i, j \in [1, n]$ and $p_i, p_j > p_0$, a decision between the candidates $i$
and $j$ can be made if and only if

$$|p_i - p_j| > p_0. \tag{2.11}$$


This is the separable condition. 


2.5 Uncertain Condition 


However, there will be intrinsic difficulties in making a decision between the
candidates $i$ and $j$ simply from their measures $p_i$ and $p_j$ if

$$|p_i - p_j| \le p_0. \tag{2.12}$$

This is the uncertain condition.
Under the uncertain condition, there is no simple way to distinguish the signals
$p_i$ and $p_j$ clearly under the interference of $p_0$.
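
The two conditions amount to a single comparison. The sketch below is illustrative only: the function name `classify` and the numeric measures are hypothetical, and the boundary case $|p_i - p_j| = p_0$ is treated as uncertain, which is an assumption about how the two conditions complement each other.

```python
# Minimal sketch: separable condition (2.11) versus uncertain condition (2.12).
def classify(p_i, p_j, p_0):
    """Return 'separable' if |p_i - p_j| > p_0, otherwise 'uncertain'."""
    return "separable" if abs(p_i - p_j) > p_0 else "uncertain"

# Hypothetical measures: a clear lead, then a near tie with the same invalid rate.
print(classify(0.52, 0.47, 0.01))        # separable
print(classify(0.4955, 0.4950, 0.0023))  # uncertain
```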


2.6 Balanced Opposites 


It is extremely hard to make any decision when both candidates gain the same number
of votes in an election. However, for any equilibrium dynamic system involving two
balanced opposites in competition, the most probable trend is $p_i = p_j$. In
general, the more complicated the feedback mechanisms involved, the more frequently
balanced events occur [10, 11].


2.7 Four Additional Policies 


To resolve conflicts in an election, four additional policies may be useful: reducing
error probability ($p_0 \to 0$), merging other candidate votes
($V_i \cup V_l \to V_i$ or $V_j \cup V_l \to V_j$; $i, j, l \in [1, n]$),
re-election (new $p_i$, $p_j$) and court decision.

The reducing error probability policy works well in certain conditions involving
only a small number of electors. Using various controlled methods, e.g., fixing the
total number of seats in Parliament at an odd number or allowing some additional
votes by Parliament Leaders, the worst case scenario where both candidates hold
equal votes without a decision can be eliminated. However, when an election involves
a number of electors as large as in the 2K-election, the voting system naturally
becomes a complex dynamic system and there is no way to make the error margin
negligible ($p_0 \to 0$).

The merging other votes policy works in simple conditions at a single location.
Combining votes for candidates from multiple locations is more difficult under the
approval rule than under plurality rules, since there are many overlaps among
sub-polls. There is no guarantee that the policy works. In the best cases, old
difficulties may be temporarily solved, but new similar uncertainties could
immediately emerge.

From the viewpoint of a complex dynamic system, a re-election is the same as the
original election. Therefore, the re-election policy cannot provide an improved
separable property between two candidates.

If no other solution can be found because of timing or other issues, then it is
feasible to use the Courts to make the decision. The court decision policy results
in efficient decision-making, but it breaks down the election procedure and may lose
the fairness, transparency, self-determination and other advantages of the election
process.


2.8 How Accurate Is Accurate?


It is well known that all measurements in physics and in every exact science are
inaccurate to some degree. What, then, is sufficient to be deemed accurate for an
election? Can we accept a 10% margin of error as accurate? What about 1% or even
0.1%?

In real life, an error margin of 1% would be highly commendable and one of 0.1%
would be considered highly accurate.

Voting and polling were never meant to be an exact science: polls and other
pre-election statistics had error margins of almost 5-10%. Yet in the actual
election the margin of error was smaller: in the disputed counties, e.g. Miami-Dade
and Palm Beach, only 14,000 votes out of a total of six million votes were rejected.
The margin of error was only 0.233%. Usually this would be deemed a negligible
number, as almost 99.8% of the votes were valid. However, it was not enough to
separate the two candidates: to do so, the number of rejected votes would have to
fall from 14,000 to 100. Under that condition, an error margin of at most about
0.0017% (100 out of six million) is required. This is highly improbable because of
cost, time and other factors.
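
The arithmetic behind these margins can be reproduced in a few lines. This is a small sketch using only the figures quoted above (six million votes, 14,000 rejected, a target of 100 rejected); the variable names are illustrative.

```python
# Minimal sketch: error margins as percentages of the total vote count.
total_votes = 6_000_000   # "six million votes"
rejected = 14_000         # votes actually rejected
target_rejected = 100     # rejected votes that would allow a separation

margin = rejected / total_votes * 100
required = target_rejected / total_votes * 100

print(f"{margin:.3f}%")    # 0.233%
print(f"{required:.5f}%")  # 0.00167%
```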


2.9 Shifting Attentions from Invalid Votes to Valid Votes 


Almost 99.8% of the votes are valid. This indicates that, in order to determine the
winner under the uncertain condition, it is necessary to fetch additional
information from the valid votes instead of reducing the error margin by handling
invalid votes. The total number of votes is far greater than the number of
candidates. This makes it possible to extract additional information using
cross-classification methods based on contingency-table-like techniques among
multiple categories. The cross-classification technique is a powerful toolkit in
modern statistics [12, 13, 14, 15].

Under additional categories such as location, age group and sex, valid votes are
categorized as two-dimensional classified feature distributions in the respective
contingency tables. Such spatial or histogram-like feature distributions provide
invaluable information for improving the separable properties between two uncertain
candidates. To represent this idea, a new model is proposed in the next section.


3 Component Ballot Model 


To overcome the intrinsic complexities and uncertainty problems in approval voting
practice, a new model, the Component Ballot Model, is proposed in this section. It
uses multiple variables on a ballot for a better description and an easier
comparison.


3.1 Definitions


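To be consistent with the previous notation, similar symbols (e.g. for the ballot
paper) are used. However, the contents of the ballot paper and the other notations
are compounded into vector forms.


Let $C = \{C_1, C_2, \ldots, C_m\}$ be a set of the selected conditions. The $i$th
item contains $n_i$ distinct values for selection,
$C_i = (c^i_1, \ldots, c^i_j, \ldots, c^i_{n_i})$, $j \in [1, n_i]$,
$i \in [1, m]$.

A ballot $B$ (or a component ballot) is a vector composed of $m$ items:

$$B = \begin{pmatrix} C_1 \\ \vdots \\ C_i \\ \vdots \\ C_m \end{pmatrix}
= \begin{pmatrix}
(c^1_1, \ldots, c^1_j, \ldots, c^1_{n_1}) \\ \vdots \\
(c^i_1, \ldots, c^i_j, \ldots, c^i_{n_i}) \\ \vdots \\
(c^m_1, \ldots, c^m_j, \ldots, c^m_{n_m})
\end{pmatrix}, \quad j \in [1, n_i],\ i \in [1, m]. \tag{3.1}$$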


Component items on a ballot add information about the elector to the paper, such as
sex, voting time, location, age group, minority status, living area, social security
and employment situation.

For example, the first item contains 10 candidates, the second item presents
100,000 locations, the third item has 3 sex groups (male, female, neutral), the
fourth item contains 150 age groups, and the fifth item indicates $10^{10}$ social
security numbers. Under these conditions, a ballot paper could be

$$B = \begin{pmatrix} C_1 \\ C_2 \\ C_3 \\ C_4 \\ C_5 \end{pmatrix}
= \begin{pmatrix}
(c^1_1, \ldots, c^1_{10}) \\
(c^2_1, \ldots, c^2_{100000}) \\
(c^3_1, c^3_2, c^3_3) \\
(c^4_1, \ldots, c^4_{150}) \\
(c^5_1, \ldots, c^5_{10^{10}})
\end{pmatrix},$$

$m = 5$, $n_1 = 10$, $n_2 = 100000$, $n_3 = 3$, $n_4 = 150$, $n_5 = 10^{10}$.


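A vote $v$ (or a component vote) is a record of a component ballot $B$ in which at
least one value has been assigned for each of the $m$ items:

$$v = \begin{pmatrix} v^1 \\ \vdots \\ v^i \\ \vdots \\ v^m \end{pmatrix}
= \begin{pmatrix}
(v^1_1, \ldots, v^1_l, \ldots, v^1_{n_1}) \\ \vdots \\
(v^i_1, \ldots, v^i_l, \ldots, v^i_{n_i}) \\ \vdots \\
(v^m_1, \ldots, v^m_l, \ldots, v^m_{n_m})
\end{pmatrix}, \quad v^i_l \in \{0, 1, x\},\ l \in [1, n_i],\ i \in [1, m], \tag{3.2}$$

where $n_i$ is the upper limit of the index of $v^i$; $v^i_l = 1$ (or 0) means that
$c^i_l$ is selected (or not selected), and $v^i_l = x$ indicates an invalid value
for $c^i_l$.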

More items are provided on each ballot to include more information, so further
distinctions of their valid regions are necessary. A vote $v$ is a valid vote if the
first item ($i = 1$) satisfies $v^1_l \in \{0, 1\}$,
$\sum_{l=1}^{n_1} v^1_l \ge 1$ (one or more values selected) and all additional
items satisfy $v^i_l \in \{0, 1\}$, $l \in [1, n_i]$, $i \in [2, m]$,
$\sum_{l=1}^{n_i} v^i_l = 1$ (one and only one value selected). However, if
$\exists i, l$, $v^i_l = x$, $i \in [1, m]$, $l \in [1, n_i]$, or if some $v^i$
among the additional items is assigned multiple values ($\exists i$,
$v^i_l \in \{0, 1\}$, $\sum_{l=1}^{n_i} v^i_l > 1$, $l \in [1, n_i]$,
$i \in [2, m]$), then $v$ is an invalid vote.

Normally the valid first item in a vote has a value region from
$(0, 0, \ldots, 0, 1)$ to $(1, 1, \ldots, 1)$. A total of $2^{n_1} - 1$
combinations are valid, allowing one, two or more candidates to be selected. For
each additional item, however, one and only one value is selected, from
$(0, 0, \ldots, 0, 1)$ to $(1, 0, \ldots, 0, 0)$, so only $n_i$, $i \in [2, m]$,
selections are allowed.
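
The validity rule for component votes can be sketched as follows. The tiny ballot (3 candidates, 2 locations, 3 age groups), the helper name `is_valid_component_vote` and the string "x" for an invalid value are illustrative assumptions; an additional item with no value selected is also treated as invalid here, which follows from the "one and only one value" requirement.

```python
# Minimal sketch: validity of a component vote.
# Item 0 is the candidate item (approval style: one or more selections);
# every further item must select exactly one value.
from itertools import product

X = "x"  # assumption: marker for an invalid value

def is_valid_component_vote(vote):
    if any(v not in (0, 1) for item in vote for v in item):
        return False  # some entry is an invalid value
    first, rest = vote[0], vote[1:]
    return sum(first) >= 1 and all(sum(item) == 1 for item in rest)

v_ok = [(1, 0, 1), (0, 1), (0, 0, 1)]
v_two_locations = [(1, 0, 0), (1, 1), (0, 1, 0)]
v_unreadable = [(1, X, 0), (1, 0), (0, 1, 0)]
v_no_candidate = [(0, 0, 0), (1, 0), (0, 1, 0)]

print(is_valid_component_vote(v_ok))             # True
print(is_valid_component_vote(v_two_locations))  # False
print(is_valid_component_vote(v_unreadable))     # False
print(is_valid_component_vote(v_no_candidate))   # False

# The candidate item with n_1 = 3 has 2**3 - 1 = 7 valid patterns.
n1 = 3
valid_first = [p for p in product((0, 1), repeat=n1) if sum(p) >= 1]
print(len(valid_first))  # 7
```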

Additional information about electors may be accessed from existing election
databases; there is no technical difficulty in merging it into a compound vote
automatically using modern information technology.

There is enough room on a vote for the various parameters of an elector, and a total
of $N$ electors take part in the voting.

A poll $V$ is a vote collection in which all votes can be arranged as an array with
$N$ entries:

$$V = (v(1), \ldots, v(t), \ldots, v(N)), \quad t \in [1, N]. \tag{3.3}$$

Considering that each vote has $m$ items, a poll $V$ can be represented as a 2D
$m \times N$ array:

$$V = (v(1), \ldots, v(t), \ldots, v(N))
= \begin{pmatrix}
v^1(1) & \cdots & v^1(t) & \cdots & v^1(N) \\
\vdots & & \vdots & & \vdots \\
v^i(1) & \cdots & v^i(t) & \cdots & v^i(N) \\
\vdots & & \vdots & & \vdots \\
v^m(1) & \cdots & v^m(t) & \cdots & v^m(N)
\end{pmatrix}, \quad t \in [1, N],\ i \in [1, m]. \tag{3.4}$$


3.2 Feature Partition


Let $V_C$ denote the valid poll and $V_0$ the invalid poll. $V_C$ and $V_0$
partition the poll $V$, i.e.

$$V_C = \{\forall v \mid v \text{ is a valid vote},\ v \in V\};$$
$$V_0 = \{\forall v \mid v \notin V_C,\ v \in V\};$$
$$V = V_C \cup V_0. \tag{3.5}$$

Let $V^i$ denote a sub-poll in the election. For any $i \in [1, m]$, $V^i$ collects
all valid votes of the poll $V$ for the $i$th item:

$$V^i = \Big\{\forall v(t) \;\Big|\; v(t) \in V_C,\ v^i_l(t) \in \{0, 1\},\ \sum_{l=1}^{n_i} v^i_l(t) \ge 1,\ l \in [1, n_i],\ t \in [1, N],\ i \in [1, m]\Big\}. \tag{3.6}$$

Zero-D Feature Lemma All $\{V^i\}_{i \in [1, m]}$ sub-polls contain the same votes
as the poll $V_C$:

$$V_C = V^1 = V^2 = \cdots = V^i = \cdots = V^m. \tag{3.7}$$

Proof Using Eqs. (3.5) and (3.6), a valid vote contains at least one valid value in
each category, so projecting onto any single item keeps all valid votes as one
group. □

Let $V^i_k$ denote a sub-poll in the election. For any $i \in [1, m]$, $V^i_k$
collects all valid votes of the poll $V_C$ that select the value in position $k$ of
the $i$th item:

$$V^i_k = \{\forall v(t) \mid v(t) \in V_C,\ v^i_k(t) = 1,\ t \in [1, N],\ i \in [1, m],\ k \in [1, n_i]\}. \tag{3.8}$$

One-D Feature Lemma All $\{V^i_k\}_{k \in [1, n_i]}$ sub-polls dissect a sub-poll
$V^i$:

$$V^i = \bigcup_{k=1}^{n_i} V^i_k. \tag{3.9}$$

Proof By Eqs. (3.5)-(3.8), each vote has at least one identified value. Collecting
all votes with each value gives the result. □

One-D Feature Corollary If each vote contains only one value in the category item,
then all sub-polls $\{V^i_k\}_{k \in [1, n_i]}$ partition the sub-poll $V^i$:

$$|V^i| = \sum_{k=1}^{n_i} |V^i_k|. \tag{3.10}$$

Proof By Eq. (3.9), each vote has exactly one identified value, so there is no
overlap among the sub-polls of the category item. □

Note that only the candidate category fails to satisfy the One-D Feature Corollary
under the approval voting rule; the additional categories satisfy the condition.
Unlike the Zero-D Feature Lemma, the One-D Feature Corollary provides a non-trivial
partition of the votes into multiple sub-polls.
Let $V^0$ denote the invalid poll in the election. It collects all invalid votes of
the poll $V$:

$$V^0 = \{\forall v(t) \mid v(t) \notin V_C,\ t \in [1, N]\}. \tag{3.11}$$

Since no further distinction is made for the votes in $V^0$, all votes in this poll
correspond to discarded votes.
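Let $V^{i,j}_{k,l}$ denote a sub-poll. It can be described as

$$V^{i,j}_{k,l} = \{\forall v(t) \mid v(t) \in V_C,\ v^i_k(t) = 1,\ v^j_l(t) = 1;\ t \in [1, N],\ i, j \in [1, m],\ k \in [1, n_i],\ l \in [1, n_j]\}. \tag{3.12}$$

For any $i, j \in [1, m]$, $k \in [1, n_i]$, $l \in [1, n_j]$, the votes collected
in $V^{i,j}_{k,l}$ are the same as the votes in $V^{j,i}_{l,k}$.

If $l \neq k$, then the votes in $V^{i,j}_{k,l}$ are in general different from the
votes in $V^{i,j}_{l,k}$.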


Two-D Feature Lemma All votes in
$\{V^{i,j}_{k,l}\}_{k \in [1, n_i],\, l \in [1, n_j]}$ dissect either $V^i_k$ or
$V^j_l$:

$$V^i_k = \bigcup_{l=1}^{n_j} V^{i,j}_{k,l}; \tag{3.13a}$$

or

$$V^j_l = \bigcup_{k=1}^{n_i} V^{i,j}_{k,l}. \tag{3.13b}$$

Proof By Eq. (3.12) and the One-D Feature Lemma, each vote in the sub-polls has
other identified values. Collecting all votes with each value in the relevant
sub-polls gives the result. □

Two-D Feature Corollary If a valid vote contains a single value in the selected
category item, then all votes in
$\{V^{i,j}_{k,l}\}_{k \in [1, n_i],\, l \in [1, n_j]}$ partition either $V^i_k$ or
$V^j_l$. For the $j$ category,

$$|V^i_k| = \sum_{l=1}^{n_j} |V^{i,j}_{k,l}|; \tag{3.13c}$$

or, for the $i$ category,

$$|V^j_l| = \sum_{k=1}^{n_i} |V^{i,j}_{k,l}|. \tag{3.13d}$$

Proof When each vote in the sub-polls has only a single value in the selected
category item, the sub-polls partition the selected poll. □

Under this construction, all votes in
$\{V^{i,j}_{k,l}\}^{i, j \in [1, m]}_{k \in [1, n_i],\, l \in [1, n_j]}$ dissect the
valid poll $V_C$.

When the single-value condition is satisfied, the sub-polls partition the valid
poll.
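
The difference between dissecting and partitioning can be seen on a small cross-classification. The sketch below uses five hypothetical valid votes with two items (2 candidates, 3 locations); the function name `sub_polls` and the zero-based indices are illustrative assumptions.

```python
# Minimal sketch: cross-classified sub-polls V^{i,j}_{k,l} as sets of vote indices.
from collections import defaultdict

# Hypothetical valid votes: (candidate item, location item).
votes = [
    ((1, 0), (1, 0, 0)),  # t = 0
    ((1, 1), (0, 1, 0)),  # t = 1: approves both candidates
    ((0, 1), (0, 1, 0)),  # t = 2
    ((1, 0), (0, 0, 1)),  # t = 3
    ((0, 1), (1, 0, 0)),  # t = 4
]

def sub_polls(votes, i, j):
    """Map (k, l) to the set of votes t with v^i_k(t) = 1 and v^j_l(t) = 1."""
    cells = defaultdict(set)
    for t, vote in enumerate(votes):
        for k, a in enumerate(vote[i]):
            for l, b in enumerate(vote[j]):
                if a == 1 and b == 1:
                    cells[(k, l)].add(t)
    return cells

cells = sub_polls(votes, 0, 1)

# Row k = 0 (first candidate): locations are single-valued, so the cells
# partition the candidate sub-poll {0, 1, 3} and the sizes add up.
row = [cells[(0, l)] for l in range(3)]
print(set().union(*row), sum(len(c) for c in row))  # {0, 1, 3} 3

# Column l = 1 (second location): candidates overlap under approval voting,
# so the cells only dissect the location sub-poll {1, 2}.
col = [cells[(k, 1)] for k in range(2)]
print(set().union(*col), sum(len(c) for c in col))  # {1, 2} 3
```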


3.3 Feature Matrix Representation 


For a given pair $i, j \in [1, m]$, let $k$ correspond to the row number and $l$ to
the column number. For a given set of
$\{V^{i,j}_{k,l}\}_{k \in [1, n_i],\, l \in [1, n_j]}$ sub-polls, there is a unique
feature matrix representation.


3.3.1 Feature Matrix 


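Let $V^{i,j}$ denote a feature matrix,

$$V^{i,j} = \begin{pmatrix}
V^{i,j}_{1,1} & \cdots & V^{i,j}_{1,l} & \cdots & V^{i,j}_{1,n_j} \\
\vdots & & \vdots & & \vdots \\
V^{i,j}_{k,1} & \cdots & V^{i,j}_{k,l} & \cdots & V^{i,j}_{k,n_j} \\
\vdots & & \vdots & & \vdots \\
V^{i,j}_{n_i,1} & \cdots & V^{i,j}_{n_i,l} & \cdots & V^{i,j}_{n_i,n_j}
\end{pmatrix}, \quad k \in [1, n_i],\ l \in [1, n_j]. \tag{3.14}$$

In statistical language, a feature matrix $V^{i,j}$ corresponds to a contingency
table based on cross-classified categorical data under two selected categories [13,
16, 17]. Each element of the matrix collects a subset of votes with the respective
cross-categorical meaning.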


3.3.2 Feature Matrix Set 


For a given
$\{V^{i,j}_{k,l}\}^{i, j \in [1, m]}_{k \in [1, n_i],\, l \in [1, n_j]}$, there are
a total of $2 \times \binom{m}{2} = m \times (m - 1)$ distinct feature matrices.
They compose a matrix set $VS$,

$$VS = \{V^{i,j} \mid i, j \in [1, m]\}. \tag{3.15}$$

For a given pair $i \neq j$, $i, j \in [1, m]$, in the set, each
$\{V^{i,j}_{k,l}\}_{k \in [1, n_i],\, l \in [1, n_j]}$ or
$\{V^{j,i}_{l,k}\}_{l \in [1, n_j],\, k \in [1, n_i]}$ corresponds to a unique
matrix or its transpose. However, for a given pair $i = j$, $i, j \in [1, m]$, the
matrix is equal to its transpose. So there are a total of $m \times m - m$
different matrix representations.

For a fixed item (e.g. $i = 1$) as the first index, there are a total of
$\binom{m}{1} = m$ different matrices in the system to record the different
relations among the
$\{V^{i,j}_{k,l}\}^{i, j \in [1, m]}_{k \in [1, n_i],\, l \in [1, n_j]}$ sub-polls.

Let $VSC(i)$ denote the matrix set with the first index fixed at $i$,

$$VSC(i) = \{V^{i,j} \mid j \in [1, m]\}. \tag{3.16}$$

Selecting one category for both row and column values, for a given $VSC(i)$ and
$V^{i,i}_{k,l} \in V^{i,i}$ in $VSC(i)$, if a vote in the $i$th category contains
only one valid value, then $V^{i,i}_{k,l}$ can be determined as follows:

$$V^{i,i}_{k,l} = \begin{cases} \emptyset, & \text{if } k \neq l; \\ V^i_k, & \text{if } k = l, \end{cases} \quad k, l \in [1, n_i],\ i \in [1, m]. \tag{3.17a}$$

In this case, the matrix $V^{i,i}$ is a diagonal matrix.

However, if a vote in the $i$th category contains multiple distinguishable values,
then $\{V^{i,i}_{k,l}\}$ provides cross-classified sub-polls:

$$V^{i,i}_{k,l} = V^{i,i}_{l,k}, \quad V^i_k = \bigcup_{l=1}^{n_i} V^{i,i}_{k,l} = \bigcup_{l=1}^{n_i} V^{i,i}_{l,k}, \quad k, l \in [1, n_i],\ i \in [1, m]. \tag{3.17b}$$

In this case, the matrix $V^{i,i}$ is a symmetric matrix.

For a given $VSC(i)$ and $V^{i,j}_{k,l} \in V^{i,j}$ in $VSC(i)$, the following
equation holds:

$$V^i_k = \bigcup_{l=1}^{n_j} V^{i,j}_{k,l}, \quad k \in [1, n_i],\ l \in [1, n_j],\ j \in [1, m]. \tag{3.18}$$


3.3.3 Probability Feature Matrix 


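Let $P^{i,j}$ denote a probability feature matrix corresponding to the matrix
$V^{i,j}$ and $\{p^{i,j}_{k,l}\}$ denote its element set. For any
$p^{i,j}_{k,l} \in P^{i,j}$,

$$p^{i,j}_{k,l} = \begin{cases} |V^{i,j}_{k,l}| / |V^i_k|, & V^i_k \neq \emptyset; \\ 0, & V^i_k = \emptyset. \end{cases} \tag{3.19}$$

$$P^{i,j} = \begin{pmatrix}
p^{i,j}_{1,1} & \cdots & p^{i,j}_{1,l} & \cdots & p^{i,j}_{1,n_j} \\
\vdots & & \vdots & & \vdots \\
p^{i,j}_{k,1} & \cdots & p^{i,j}_{k,l} & \cdots & p^{i,j}_{k,n_j} \\
\vdots & & \vdots & & \vdots \\
p^{i,j}_{n_i,1} & \cdots & p^{i,j}_{n_i,l} & \cdots & p^{i,j}_{n_i,n_j}
\end{pmatrix}, \quad k \in [1, n_i],\ l \in [1, n_j]. \tag{3.20}$$

For example, with $n_1 = 6$, $n_2 = 4$, a probability feature matrix can be as
follows:

$$P^{1,2} = \begin{pmatrix}
0.04 & 0.26 & 0.1 & 0.6 \\
0.42 & 0.2 & 0.3 & 0.18 \\
0.14 & 0.21 & 0.42 & 0.23 \\
0 & 0 & 0 & 0 \\
0.008 & 0.022 & 0.75 & 0.22 \\
0.33 & 0.01 & 0.23 & 0.43
\end{pmatrix} \tag{3.21}$$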
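
A probability feature matrix can be obtained by normalizing each row of a table of sub-poll sizes. The following is a minimal sketch under the assumption that the column category is single-valued, so that $|V^i_k|$ equals the row sum; the count table and the name `probability_matrix` are hypothetical, chosen so that two rows reproduce rows of the sample matrix.

```python
# Minimal sketch: row-normalize a table of sub-poll sizes |V^{i,j}_{k,l}|.
# Assumption: the column item is single-valued, so |V^i_k| is the row sum.
def probability_matrix(counts):
    result = []
    for row in counts:
        total = sum(row)
        # An empty sub-poll V^i_k gives an all-zero row.
        result.append([c / total if total else 0.0 for c in row])
    return result

# Hypothetical counts for three candidates over four locations.
counts = [
    [2, 13, 5, 30],
    [0, 0, 0, 0],
    [14, 21, 42, 23],
]

P = probability_matrix(counts)
print(P[0])  # [0.04, 0.26, 0.1, 0.6]
print(P[1])  # [0.0, 0.0, 0.0, 0.0]
print(P[2])  # [0.14, 0.21, 0.42, 0.23]
```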


3.4 Probability Feature Vector 


For any $P^{i,j}$, at most $n_i$ row vectors in the matrix need to satisfy
Eq. (3.22):

$$\sum_{l=1}^{n_j} p^{i,j}_{k,l} = 1; \quad k \in [1, n_i];\ l \in [1, n_j];\ i, j \in [1, m]. \tag{3.22}$$

Equation (3.22) follows from Eq. (3.13c) if the column items partition the sub-polls
for the given row.

Because there is no restriction among the columns of the probability feature matrix
$P^{i,j}$, different categories can be selected flexibly to partition a given vote
set $\{p^i_k\}$ into multiple distributions in larger selection spaces, which helps
to satisfy complicated dynamic system requirements.

For a given $P^{i,j}$, if the $i$th item is the categorical index of candidates,
then any candidate $k \in [1, n_i]$ has a probability feature vector, denoted by
$\Psi^{i,j}_k$, that corresponds to its probability densities relative to item $j$:

$$\Psi^{i,j}_k = (p^{i,j}_{k,1}, \ldots, p^{i,j}_{k,l}, \ldots, p^{i,j}_{k,n_j}), \quad k \in [1, n_i],\ l \in [1, n_j],\ i, j \in [1, m]. \tag{3.23}$$


3.5 Differences Between Two Probability Vectors


Let the $\{V^i_l\}_{l \in [1, n_i]}$ sub-polls together with the invalid poll form a
vector $\tilde{V}^i = (V^0, V^i_1, \ldots, V^i_l, \ldots, V^i_{n_i})$,
$l \in [1, n_i]$. This vote vector corresponds to a probability vector
$\tilde{\Psi}^i = (\tilde{p}^0, \tilde{p}^i_1, \ldots, \tilde{p}^i_l, \ldots, \tilde{p}^i_{n_i})$,
$l \in [1, n_i]$, where

$$\tilde{p}^i_l = |V^i_l| / (|V^i| + |V^0|) = N_l / N, \quad l \in [1, n_i] \tag{3.24}$$

and

$$\tilde{p}^0 = |V^0| / (|V^i| + |V^0|) = N_0 / N, \quad i \in [1, m]. \tag{3.25}$$

Let the $\{V^i_l\}_{l \in [1, n_i]}$ sub-polls alone form a vector
$V^i = (V^i_1, \ldots, V^i_l, \ldots, V^i_{n_i})$, $l \in [1, n_i]$, and

$$p^i_l = |V^i_l| / |V^i| = N_l / (N - N_0), \quad l \in [1, n_i],\ i \in [1, m]. \tag{3.26}$$

The vector $V^i$ corresponds to a probability vector $\Psi^i$,

$$\Psi^i = (p^i_1, \ldots, p^i_l, \ldots, p^i_{n_i}), \quad l \in [1, n_i]. \tag{3.27}$$

If the $i$th item of a vote indicates the ordinal number of candidates in an
election, the probability vector $\Psi^i$ is a special case of a linear spectral
distribution.

For any $l$th candidate, if $1 > p^i_l \gg \tilde{p}^0 > 0$, then
$\tilde{p}^i_l \approx p^i_l$.

Considering the difference between the two probability measures,

$$\begin{aligned}
p^i_l - \tilde{p}^i_l &= N_l / (N - N_0) - N_l / N \\
&= N_l N_0 / \big(N (N - N_0)\big) \\
&= N_l / (N - N_0) \times N_0 / N \\
&= p^i_l \times \tilde{p}^0 > 0.
\end{aligned} \tag{3.28}$$

Equation (3.28) indicates that, when the probability measure of invalid votes is
small compared with the candidate measures, there is no significant difference
between the probability measures $\tilde{p}^i_l$ and $p^i_l$ of a candidate in the
two probability vectors $\tilde{\Psi}^i$ and $\Psi^i$ respectively.

If any $l$th and $q$th candidates gain a similar number of votes in an election and
satisfy the uncertain condition, then the difference between the probability
measures $p^i_l$ and $p^i_q$ is restricted by the uncertain condition too.

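Considering the probability measure difference under the uncertain condition, the
difference is

$$\begin{aligned}
|p^i_l - p^i_q| &= |(p^i_l - \tilde{p}^i_l) + (\tilde{p}^i_l - \tilde{p}^i_q) + (\tilde{p}^i_q - p^i_q)| \\
&= |(\tilde{p}^i_l - \tilde{p}^i_q) + (p^i_l - \tilde{p}^i_l) - (p^i_q - \tilde{p}^i_q)| \\
&\le |\tilde{p}^i_l - \tilde{p}^i_q| + (p^i_l - \tilde{p}^i_l) + (p^i_q - \tilde{p}^i_q).
\end{aligned} \tag{3.29}$$

By Eq. (3.28),

$$(p^i_l - \tilde{p}^i_l) + (p^i_q - \tilde{p}^i_q) = (p^i_l + p^i_q) \times \tilde{p}^0 \ge 0, \tag{3.30}$$

so that, under the uncertain condition
$|\tilde{p}^i_l - \tilde{p}^i_q| \le \tilde{p}^0$,

$$|p^i_l - p^i_q| \le |\tilde{p}^i_l - \tilde{p}^i_q| + (p^i_l + p^i_q) \times \tilde{p}^0 \le \tilde{p}^0 + (p^i_l + p^i_q) \times \tilde{p}^0 \;\Rightarrow\; |p^i_l - p^i_q| \le 3 \times \tilde{p}^0. \tag{3.31}$$

Equation (3.31) indicates that the new probability vector does not solve the
uncertain problem. To overcome the difficulty, other techniques need to be employed.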
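
The relation between the two measures and the resulting bound can be checked numerically. This is a small sketch: the totals are those quoted in Sect. 2.8, while the two candidate counts `N_l` and `N_q` are hypothetical values for a close race.

```python
# Minimal sketch: compare measures over all votes (tilde) and over valid votes only.
import math

N, N0 = 6_000_000, 14_000        # total and invalid votes quoted in Sect. 2.8
N_l, N_q = 2_993_200, 2_992_800  # hypothetical counts for two close candidates

p0_t = N0 / N                            # invalid-vote measure
pl_t, pq_t = N_l / N, N_q / N            # measures relative to all votes
pl, pq = N_l / (N - N0), N_q / (N - N0)  # measures relative to valid votes

# The difference of the two measures equals p_l times the invalid-vote measure.
diff_ok = math.isclose(pl - pl_t, pl * p0_t, rel_tol=1e-9)

# The candidates are in the uncertain condition, and the renormalized
# measures stay within three times the invalid-vote measure of each other.
uncertain = abs(pl_t - pq_t) <= p0_t
bounded = abs(pl - pq) <= 3 * p0_t

print(diff_ok, uncertain, bounded)  # True True True
```

Renormalizing by the valid votes alone therefore leaves the two candidates as close as before, which is the point of the derivation above.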


3.6 Permutation Invariant Group 


For any $\Psi^{i,j}_k$, a permutation invariant group $\Psi(i, j|k)$ can be
constructed; it collects the vectors obtained by using all elements of
$\Psi^{i,j}_k$ as constructors of the possible permutations.


3.6.1 Feature Index and Permutation Invariant Family


For a vector $\Phi \in \Psi(i, j|k)$, if it is feasible to define a numeric measure
(or feature index) $\lambda$ such that all vectors $\forall \Phi \in \Psi(i, j|k)$
have the same index, then the feature index $\lambda$ is an invariant of
$\Psi(i, j|k)$.

For $\forall \Phi \in \Psi(i, j|k)$,

$$\{\exists \lambda \mid \lambda(\Phi) = \lambda(\Xi) = c,\ \Phi \neq \Xi;\ \Phi, \Xi \in \Psi(i, j|k),\ k \in [1, n_i],\ l \in [1, n_j],\ i, j \in [1, m]\}. \tag{3.32}$$


3.6.2 Polynomial Feature Index Family 


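For any probability vector $\Psi = (p_1, \ldots, p_l, \ldots, p_m)$ with $m$ items
and $\forall k \in [1, m]$, $p_k > 0$, a family of polynomial indexes
$\{\lambda_n\}$ is defined by Eqs. (3.33)-(3.36).

$$\lambda_0(\Psi) = \sum_{l=1}^{m} (p_l)^0 = m; \tag{3.33}$$

$$\lambda_1(\Psi) = \sum_{l=1}^{m} p_l = 1; \tag{3.34}$$

$$\lambda_2(\Psi) = \sum_{l=1}^{m} (p_l)^2; \tag{3.35}$$

$$\lambda_n(\Psi) = \sum_{l=1}^{m} (p_l)^n, \quad n \ge 0. \tag{3.36}$$

For example, using the sample probability matrix $P^{1,2}$ of Eq. (3.21), its
polynomial indexes $\{\lambda_n\}$, computed row by row, are

$$\lambda_0(P^{1,2}) = \begin{pmatrix} 4 \\ 4 \\ 4 \\ 4 \\ 4 \\ 4 \end{pmatrix}; \quad
\lambda_1(P^{1,2}) = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 1 \\ 1 \end{pmatrix}; \quad
\lambda_2(P^{1,2}) = \begin{pmatrix} 0.4392 \\ 0.3388 \\ 0.293 \\ 0 \\ 0.611448 \\ 0.3468 \end{pmatrix}; \quad
\lambda_3(P^{1,2}) = \begin{pmatrix} 0.23464 \\ 0.11492 \\ 0.09826 \\ 0 \\ 0.43253416 \\ 0.127612 \end{pmatrix}.$$


3.6.3 Entropy Feature Index


For a probability vector $\Psi = (p_1, \ldots, p_l, \ldots, p_m)$ with $m$ items, an
entropy feature index $\lambda_E$ is defined by Eq. (3.37).

$$\lambda_E(\Psi) = -\sum_{l=1}^{m} p_l \times \ln(p_l). \tag{3.37}$$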


In the polynomial index family $\{\lambda_n(\Psi)\}_{n \ge 0}$, $\lambda_0(\Psi)$
indicates the length of the vector and $\lambda_1(\Psi)$ provides the normalized
measure. In addition to the $\{\lambda_n(\Psi)\}_{n \ge 0}$ family,
$\lambda_E(\Psi)$ provides another type of index, related to the entropy
measurement. Using one of these indexes, it is feasible to distinguish two
probability vectors in different permutation groups.

For example, using the same probability matrix $P^{1,2}$ of Eq. (3.21), its entropy
index $\lambda_E$ is

$$\lambda_E(P^{1,2}) = \begin{pmatrix} 1.015748065 \\ 1.356003379 \\ 1.305367539 \\ 0 \\ 0.6714638476 \\ 1.113842971 \end{pmatrix}$$
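
Both index types are straightforward to compute for a single probability feature vector. The sketch below uses the fifth and sixth rows of the sample matrix; the function names `poly_index` and `entropy_index` are illustrative, and the natural logarithm is used as an assumption that is consistent with the printed value for the fifth row.

```python
# Minimal sketch: polynomial indexes and the entropy index of a probability vector.
import math

def poly_index(psi, n):
    """Sum of the n-th powers of the components."""
    return sum(p ** n for p in psi)

def entropy_index(psi):
    """Entropy with natural logarithm; zero components contribute nothing."""
    return -sum(p * math.log(p) for p in psi if p > 0)

row5 = [0.008, 0.022, 0.75, 0.22]  # fifth row of the sample matrix
row6 = [0.33, 0.01, 0.23, 0.43]    # sixth row of the sample matrix
permuted = [0.75, 0.22, 0.008, 0.022]  # a permutation of row5

print(poly_index(row5, 0))            # 4
print(round(poly_index(row5, 2), 6))  # 0.611448
print(round(poly_index(row5, 3), 8))  # 0.43253416
print(round(entropy_index(row5), 6))  # 0.671464

# Every index is unchanged by a permutation of the components ...
same = all(math.isclose(poly_index(row5, n), poly_index(permuted, n)) for n in range(5))
# ... while vectors from different permutation groups differ in some index.
different = not math.isclose(poly_index(row5, 2), poly_index(row6, 2))
print(same, different)  # True True
```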


3.7 Two Probability Vectors and Their Feature Indexes 


If two probability vectors $\Psi^{i,j}_k$ and $\Psi^{i,j}_{k'}$ have two distinct
index families

$$\{\lambda_n(\Psi^{i,j}_k)\}_{n \ge 0} \text{ and } \{\lambda_n(\Psi^{i,j}_{k'})\}_{n \ge 0}, \quad \exists n,\ \lambda_n(\Psi^{i,j}_k) \neq \lambda_n(\Psi^{i,j}_{k'}),\ 1 < n \le \lambda_0(\Psi^{i,j}_k),$$

then the two vectors belong to two different permutation groups.

For two probability vectors $\Psi^{i,j}_k$ and $\Psi^{i,j}_{k'}$, if each vector
belongs to one permutation group and cannot be generated from the other vector, then
$\exists n > 1$, $\lambda_n(\Psi^{i,j}_k) \neq \lambda_n(\Psi^{i,j}_{k'})$,
$1 < n \le \lambda_0(\Psi^{i,j}_k)$.

Under such conditions, if two vectors have different index families, then they are
in different permutation groups. Conversely, when neither vector can be generated
from the other, at least one index is distinguishable.


3.8 CBM Construction 


Let CBM denote a Component Ballot Model. A CBM is a collection of a ballot form,
vote sequences, the poll and poll component matrix collection, and probability
matrix collections with normalized probability vectors, plus the selected indexing
family for an election:

$$CBM = (B \mid V, VS, \{P^{i,j}\}, \{\lambda_n\}). \tag{3.38}$$

Comparing SBM (Eq. 2.3) with CBM (Eq. 3.38), it is clear that the SBM is the
simplest case of the CBM and that the CBM provides more powerful properties for
refined descriptions and comparisons in complicated voting applications.


Two-D Separable Proposition For two candidates who gain a similar number of votes
under the uncertain condition, it is always feasible to use other categorical
information (e.g. location, age group) to re-partition the sub-polls for each
candidate. If the two refined probability feature vectors belong to two different
permutation groups, then the uncertain problem can be solved in most scenarios by
using the polynomial feature index family or the entropy feature index.

Proof In most scenarios, cross-classified categorical data produce probability
feature vectors with significant differences in their respective density
distributions. Under different categories without simple correspondences, this
mechanism makes it possible to use the same strategy to handle the votes for both
candidates. Since one party may be very strong in certain policies and relatively
weak in others, those differences create probability feature vectors that are more
easily located in different permutation groups. Even in the most balanced election
events from a global viewpoint, hugely distinguishable distributions exist in local
regions. This is the most important reason why two probability feature vectors
yield a pair of significantly distinct feature indexes. □


In a complex dynamic system, equilibrium is the most probable state when the system
is in dynamic balance. However, there are significant differences among local areas
even under the most balanced conditions. This is the most powerful part of the
proposed model for resolving uncertainty in complex dynamic systems in general.

For an election to avoid uncertainty, and the frustration caused by an uncertain
voting result, it is necessary to pre-select an odd number $m - 1 > 1$ of additional
categories different from the candidates. The following main conclusion can be
stated.


Voting Authority Proposition If two candidates in an election under the approval
rule are in uncertainty, then additional categories (an odd number $m - 1 > 1$)
under pre-agreed conditions can be used. These create $m - 1$ pairs of feature
indexes for deciding who will be the winner.


Proof According to the Two-D Separable Proposition, each additional category can
provide a pair of significantly distinct feature indexes to separate the two
candidates, and all selected $m - 1$ pairs have this property. Since $m - 1$ is an
odd number, each pair of indexes acts as an authority vote, so the majority rule can
be used to make the decision. □
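
The majority step of the proposition can be sketched as follows. The text does not specify how a pair of feature indexes favors one candidate, so the rule used here (the larger index wins the authority vote of that category) is purely an assumption that would have to be pre-agreed; the function name `authority_decision` and the index values are hypothetical.

```python
# Minimal sketch: majority rule over an odd number of index pairs.
# Assumption (not from the text): in each category the candidate with the
# larger feature index receives that category's authority vote.
def authority_decision(index_pairs):
    """index_pairs: one (index_for_A, index_for_B) pair per additional category."""
    if len(index_pairs) < 3 or len(index_pairs) % 2 == 0:
        raise ValueError("an odd number (> 1) of additional categories is required")
    if any(a == b for a, b in index_pairs):
        raise ValueError("every pair of indexes must be distinct")
    votes_a = sum(1 for a, b in index_pairs if a > b)
    votes_b = len(index_pairs) - votes_a
    return "A" if votes_a > votes_b else "B"

# Hypothetical entropy indexes for candidates A and B in three categories.
pairs = [(1.0157, 0.6715), (0.98, 1.21), (1.30, 1.11)]
print(authority_decision(pairs))  # A
```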


4 Conclusion and Further Work 


In the proposed Component Ballot Model, multiple probability feature matrix
collections are employed, and component categories other than the candidate are
placed on ballot papers to overcome confusion and frustration when two candidates
are in uncertainty.

By applying invariant constructions to probability feature vectors, together with
the distinguishable properties among measurements in the polynomial and entropy
feature index families, the voting authority provides a stable indexing mechanism in
which the whole calculation is based on valid votes. The distinguishable and
invariant properties among feature index families provide reliable measurements for
election outcomes.

The basic ideas, tools and technologies in this chapter originated from the author's
research work in the 1990s on advanced content-based information retrieval and image
feature indexing [18-20].

Because the approval rule is only one of the rules in practical voting systems,
readers may consult the author's other paper discussing related aspects of voting
theory under plurality and majority rules [21]. It would be interesting to know
whether the proposed model can be applied consistently to other voting systems (such
as Borda rules, proportional-representation systems and preference voting systems).
Similar uncertainty exists in other voting mechanisms, so this will be a natural
extension of the current study.

To satisfy practical voting systems, it is essential to establish testing frameworks
that make recommendations on the specific invariant properties contained in the
proposed or new indexing families. Different voting systems may well require various
combinations of different feature indexing schemes to satisfy their optimal
properties. More case studies linking theoretical models and practical applications
should be conducted to solve complicated voting paradoxes and other similar
problems.


Part V
Applications—Global Variant Functions 


The only thing permanent is change. 
—Immanuel Kant 


The scientist needs an artistically creative imagination. 
—Max Planck 


The thought: A logical inquiry. 
—Gottlob Frege 


Extensive research on global functions and their distributions was published in the
period 2000-2010. Conjugate transformation and content-based image retrieval are
typical examples of this development. Using a hierarchical knowledge model
architecture, multiple levels of balanced structures were developed both in image
analysis and processing, e.g., Towards Automated Mammographic Image Analysis,
Proceedings of the 2005 IEEE International Conference on Information Acquisition,
pp. 85-90, and in content-based retrieval, e.g., Mixed Query Image Retrieval System,
Proceedings of the 2007 IEEE International Conference on Information Acquisition,
DOI: https://doi.org/10.1109/ICIA.2007.4295776.

In association with variant logic and its various applications, wider explorations
were carried out on cellular automata functions examined under different symmetric
conditions; see, for example, Permutation and Complementary Algorithm to Generate
Random Sequences for Binary Logic, International Journal of Communications,
Network and System Sciences 4(5):345-350, 2011.

This part of global variant functions is composed of five chapters (11-15). 

Chapter “Biometrics and Knowledge Management Information Systems” describes
a hierarchical framework that applies the concept cell model to biometrics and KMIS
applications. Searches for bride images and fingerprints are given as typical sample
applications, in addition to the processing of SARS and fingerprint images.

Chapter “Recursive Measures of Edge Accuracy on Digital Images” uses
recursive measures to handle image edges under different conditions and to compare
various edge algorithms, their edge quality and their accuracy. Conjugate maps and
four other edge schemes {Gradient, Laplacian, Gaussian, Mathematical Morphology}
were selected.

Chapters “2D Spatial Distributions for Measures of Random Sequences Using
Conjugate Maps” to “3D Visual Method of Variant Logic Construction for Random
Sequence” use the variant logic framework to illustrate 2D/3D visual maps of
variant logic operations under n = 2 conditions, showing the global visual
distributions of their configurations in functional spaces.


Biometrics and Knowledge Management
Information Systems


Jeffrey Zheng and Chris Zheng 


Abstract Biometrics and knowledge management information systems are two
important fields that have attracted wide attention from different social groups in
recent years. This chapter explores the use of hierarchical construction to link
biometrics applications with knowledge management information systems. The key
issues are discussed, and a sample case of information acquisition in a content-based
image retrieval system is illustrated.


Keywords Biometrics · Complexity · Hierarchical organization · Feature
classification · Content-based image retrieval


1 Introduction 


Biometrics has attracted public attention in recent years owing to terrorist attacks,
rapid scientific development and advanced information technology. One of the most
significant achievements in biology in the twenty-first century is the decoding of the
full list of gene codes of human DNA sequences. With advanced pattern recognition
technology, real-time face verification and fingerprint identification are now
convenient.
In general, all quantitative measures of living objects and their activities, from
sources including biology, anatomy, sound, photographs, electronics and nerve pulses,
can be linked to biometrics. In such extremely complicated fields, if essential
information can be acquired efficiently and manipulated by knowledge management
information systems, then this mechanism will play an important role in the practice
of applied biometrics. Useful concepts, methodologies and software/hardware toolkits
in this direction will be invaluable for biometric applications in practical
environments.

To resolve real-world problems, it is useful to apply system engineering schemes
based on analysis and synthesis mechanisms. In this chapter, hierarchical construction
is used as a framework to represent biometrics and knowledge management
information systems. The original concepts and methodologies used in the chapter
come from an established theoretical construction, the conjugate classification and
transformation of dynamic systems [1-3]. The main algorithms and methods derived
from these concepts have been implemented in software packages for advanced image
analysis, content-based image retrieval and image understanding systems.

Applying these concepts and methodologies to biometrics is new. The author
welcomes the opportunity to discuss this possibility in detail with other experts in
the field.


2 Different Complexity Issues in Biometrics Applications 


Different measurements may take various forms and contents in practical biometrics
applications. In a measure space, a measured data set can relate to length, position,
angle, time and other basic measurable quantities. Using the number of dimensions of
a geometric space to represent different biometric objects has proved extremely
useful in many applications. Very rich contents can be observed through the following
representative biometric measures.


Infrared Detector for SARS detection (1D body temperature > 38°C) 

To limit the spread of the SARS virus, infrared detectors installed on the main
passages of airports, stations and customs played an active role by indirectly
measuring whether body temperature was higher than 38 °C. This process significantly
reduced the fatal spread of the SARS virus.


DNA sequence (1.5D sequence) 

A DNA sequence is composed of four types of gene codes that form a linear structure
of conjugate pairs. Since the sequence itself has very complicated combinatorial
characteristics and also local grouping properties, its structure is much more
complex than a simple 1D linear sequence [4].


Face identification and early breast cancer detection (2D) 

Most image analysis systems, especially face identification and early breast cancer
detection systems, manipulate 2D features. In larger applications or data sets, these
feature spaces are very complicated.
CT scanning and reconstruction (3D and higher D)

Using modern CT medical imaging equipment, it is feasible to reconstruct 3D images
from sequences of multiple 2D image slices to represent the complicated projection
and dynamic properties of areas and organs of interest. 3D visualization has much
more complicated properties than 2D image visualization.


Retinal analysis and synthesis (higher D nerve network) 

The detailed principles of the retinal nerve network in human vision are not fully
understood, but its biological structure is well recognized as a set of interconnected
nerve networks. This type of connectivity is of much higher dimension than three. The
corresponding distributions on brain surfaces and visual simulations indicate that
hierarchical structures occur naturally in optical nerve systems [5].


Abstract Thinking (Super Hypercomplex Cells) 
The capacity for abstract thinking may belong to super hierarchical organizations of
nerve systems. If real nerve objects correspond to it, this structure could consist of
super hypercomplex cells or their superposition on an extensive hierarchy [5].

From a certainty viewpoint, lower dimensional cases have more certain properties
than higher dimensional ones. In addition, higher dimensional structures express
abstract properties with more variables and richer possibilities in real-world cases.


3 Proper Concepts, Methods and Useful Toolkits 


Using modern mathematical toolkits, concepts and methods such as geometric topology
and combinatorial topology, it is feasible to analyse the neighbourhood relationships
of a kernel structure and so partition a complicated system into a family of
non-reducible invariant characteristic bases. Using these non-reducible bases as
generators, synthesis techniques can be applied to rebuild the complicated system in
certain forms [6]. In applications involving invariant and singularity analysis,
global topological characteristics play a core role in modern mathematical analysis
toolkits [7]. Since connectivity is a topological property, higher dimensional
geometric problems can be represented as graph problems or in other forms, so that
common probability and statistical methods can be used in practical calculations to
resolve the equivalent problems to a certain degree [8]. However a particular problem
is represented in detail, abstract concepts can always be represented as lattice
structures.

After a systematic analysis of modern knowledge management information systems at
the conceptual, principle and operational levels, a useful kernel structure, the
Concept Cell Model, has been proposed as a base construction toolkit for
representation; it uses directed acyclic lattices in hierarchical constructions for
knowledge management [9, 10]. The model distinguishes two similar lattices, each with
three essential concept levels, as building blocks of lattice constructions in
different abstract structures:

Time Invariant Structure: Descriptive Knowledge Lattice (Tacit, Implicit, Explicit)
Time Variable Structure: Procedure Knowledge Lattice (Start, Operation, Finish).


Under hierarchical construction, it is convenient and efficient to represent
knowledge systems in terms of information requests, abstract representation,
categories, organization and other static and dynamic application requirements.

Concept cells in hierarchies can efficiently represent everything from real
measurement data sets to higher level conceptual networks, so that application
systems are represented as multiple levels of organization. This provides an
operational knowledge management framework that flexibly supports use cases, abstract
design, implementation and operation requirements in system engineering practice. By
applying conceptual categories, it is feasible to construct useful application
systems with powerful self-organization and self-learning capacities in wider
engineering and social environments.

To make the main point easier to understand, Fig. 1 shows an example of a partial
structure from an implemented content-based image retrieval system that uses
hierarchical concept structures.

In this construction, a single index represents the specific content-based
information extracted from one image. A set of images corresponds to a set of single
indexes, which is organized as a list. It is convenient to use a multiple-level
hierarchy whose end nodes are the single indexes of this list. Each intermediate node
is a combined index that groups multiple indexes whose contents are strongly similar.
In this way, a root node can be established by combining individual nodes and
intermediate nodes, and it represents the whole set of indexes. Three types of
information can be distinguished as follows:


[Figure: index hierarchy (root index; combined index; single indexes in linear order)
aligned with the descriptive lattice (tacit; implicit; explicit)]

Fig. 1 Descriptive lattice in hierarchical representation 
Single index: individual information, explicit
Combined index: group information, implicit
Root index: whole information, tacit.


Using the descriptive lattice structure at multiple levels of representation, a
complicated content-based image retrieval system can be mapped to a multi-layer
network structure. This provides an efficient way to acquire and organize information
that links individuals, groups and the whole in the construction of an information
network.

During a search operation, the query index is checked from the root (tacit node)
through the combined indexes (implicit nodes) to the single indexes (explicit nodes)
to obtain the best-matched cases in the hierarchy. Using the best-match information,
a selected image group is returned as the output result.
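
The search described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implemented retrieval system: the `IndexNode` class, the squared Euclidean `distance` and the `top_k` parameter are hypothetical names introduced here, and each node is assumed to carry a representative feature vector.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IndexNode:
    """Hypothetical node of the index hierarchy (not from the original system)."""
    feature: Tuple[float, ...]            # representative feature vector
    image_id: Optional[int] = None        # set only on single (explicit) indexes
    children: List["IndexNode"] = field(default_factory=list)


def distance(a, b):
    """Squared Euclidean distance, an assumed similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(root, query, top_k=3):
    """Descend from the root (tacit) through the closest combined (implicit)
    nodes, then rank the single (explicit) indexes of the selected group."""
    node = root
    # Keep descending while the current node still contains groups.
    while any(child.children for child in node.children):
        groups = [child for child in node.children if child.children]
        node = min(groups, key=lambda g: distance(g.feature, query))
    leaves = [child for child in node.children if child.image_id is not None]
    leaves.sort(key=lambda leaf: distance(leaf.feature, query))
    return [leaf.image_id for leaf in leaves[:top_k]]


# Toy hierarchy: one root, two combined indexes, five single indexes.
group_a = IndexNode((0.0, 0.0), children=[
    IndexNode((0.1, 0.0), image_id=193),
    IndexNode((0.0, 0.1), image_id=194),
    IndexNode((0.2, 0.2), image_id=195),
])
group_b = IndexNode((5.0, 5.0), children=[
    IndexNode((5.0, 5.1), image_id=7),
    IndexNode((4.0, 4.0), image_id=8),
])
root = IndexNode((2.5, 2.5), children=[group_a, group_b])

print(search(root, (0.0, 0.1)))  # image ids ordered from best to worst match
```

The greedy descent visits only one group per level, which keeps the search fast but may miss a close match stored under a neighbouring group; a practical system would probably examine several candidate groups.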

In Fig. 2, two sets of implemented results, on bride images and on fingerprint
verification, are provided to illustrate the visual quality of the retrieved output.
The 125th bride image is selected as a query, and a list of similar bride images is
retrieved. The 194th fingerprint image is selected as a query example; the output,
shown in the right panel, lists the best 20 matched images from the image database in
order of decreasing similarity, among which the 194th, 193rd and 195th images are
closely related fingerprints from the same person.

Two sets of image processing results are shown in Figs. 3, 4 and 5. In Fig. 3, four
enhanced results of an original SARS image are selected. In Figs. 4 and 5, a
fingerprint image is processed with special enhancement functions under different
parameters.


4 Demand in Future Society 


From the viewpoint of biometric measures, the measured data themselves can be very
accurate and entirely certain as numeric values. Through hierarchical construction,
however, more uncertainty appears in the higher level contents. Complicated
interconnections link simple single measures to a complicated global organization.
Using hierarchical construction, it is feasible to organize single, group and whole
information through network construction to cover wider applications.

With the rapid development of web-based networks, high-speed interactive facilities
and quick connections have significantly changed traditional concepts and methods.
Knowledge management information systems are a convenient means of information
acquisition, intelligent analysis, combination and synthesis.
Fig. 3 Four image enhancements on SARS image (a-e); a Original; b Positive enhanced;
c Valley enhanced; d Hill enhanced; e Negative enhanced


Hierarchical operations have become the most advanced part of optimal control and
of the best operational strategies. In the current application environment, fast,
convenient and efficient design and implementation can find wide application in many
fields. Automatic and intelligent methodologies can be expected to handle complicated
issues, especially complex and time-consuming design processes. In the face of many
practical applications, simple and unified concepts can help larger dynamic systems
to form stable structures. Global interactive connections and their evaluation will
help the social environment to develop rapidly and sustainably.
Fig. 4 Four image enhancements on fingerprint image (a-e); a Original; b Positive
enhanced; c Valley enhanced; d Hill enhanced; e Negative enhanced


5 Base Strategy of Development 


No theoretical scheme can guarantee its own success in practical operation unless it
is carefully matched to the requirements of its environment. Under current social and
economic conditions, it is all the more important for biometrics to make a positive
impact on the social economy and to support existing developments. Market-oriented
mechanisms can be used to resolve key problems in applications. It is most important
to identify the core technology of an application and to concentrate the required
energy and resources on it so as to achieve a significant impact.

In knowledge management information systems, content-based acquisition,
representation, indexing and retrieval are the core components for automatic
organization and highly efficient retrieval. Ultra-fast and accurate retrieval
technology for databases and meta-knowledge bases can be widely used in many
applications to satisfy requirements for information acquisition, extraction,
categorization, organization, storage and retrieval. In the global web-based
environment, the hierarchical organization of knowledge management systems and
biometrics will be further refined and developed in health environments.

Fig. 5 Ten enhanced results of a fingerprint image (a-c); a Original; b1-b5 Hill
enhanced; c1-c5 Valley enhanced; b1/c1 α = 30; b2/c2 α = 80; b3/c3 α = 128;
b4/c4 α = 160; b5/c5 α = 220
Recursive Measures of Edge Accuracy
on Digital Images


Jeffrey Zheng and Chris Zheng 


Abstract In this chapter, an edge accuracy model is proposed for digital images, and
five types of edge detection methods are discussed as examples to investigate their
edge maps under recursive operations. Using an invariance criterion, it is possible
to compare different schemes in terms of accuracy, consistency, completeness and
simplicity. This provides a general mechanism for accurate edge extraction from
digital images.


Keywords Edge detection · Accuracy · Invariant · Digital image


1 Introduction 


Edge detection is of fundamental importance in image analysis, image processing and
computer vision applications. As the first step of visual perception, it has been the
focus of extensive R&D for 40 years (and of the drawing arts for more than forty
thousand years of human civilization). Many useful edge detection operators have been
invented and widely applied.

From an operational viewpoint, edge detection creates edge maps from images, as
shown in Fig. 1a. Edge detection operators identify significant changes in visual
objects as their edges or contours. From a historical viewpoint, common edge
detection approaches fall into five categories. Traditional edge detection has three
main categories: Gradient, Laplacian and Gaussian; the other two categories are
mathematical morphology and conjugate. The five categories are briefly introduced as
follows.

Fig. 1 Recursive edge extraction. a Edge detection; b Recursive edge maps; c Edge
accuracy measures
1.1 Gradient


A gradient scheme has a direction associated with its convolution operations; 2 × 2
or 3 × 3 matrices, or more complicated schemes, can be used to construct the relevant
operators. For example, the Roberts operator uses 2 × 2 matrices to detect edges
along the main diagonal or anti-diagonal directions. The Prewitt, Sobel and Isotropic
schemes use 3 × 3 matrices with different parameters to extract horizontal or
vertical edges from digital images, as shown in Fig. 2a.
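
As a minimal sketch of the gradient operators named above, the following Python code applies the standard Roberts (2 × 2) and Sobel (3 × 3) kernels to a small grey-level image stored as nested lists. The helper names `correlate_valid` and `gradient_magnitude`, and the use of |gx| + |gy| as the edge strength, are choices made here for illustration only; they are not taken from the chapter.

```python
# Standard kernels; the names are chosen here for illustration.
ROBERTS_MAIN = [[1, 0], [0, -1]]   # difference along the main diagonal
ROBERTS_ANTI = [[0, 1], [-1, 0]]   # difference along the anti-diagonal
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def correlate_valid(image, kernel):
    """Slide the kernel over the image; keep positions where it fits fully."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(w - kw + 1)
        ]
        for r in range(h - kh + 1)
    ]


def gradient_magnitude(image, kernel_a, kernel_b):
    """Edge strength approximated as |ga| + |gb| for a pair of kernels."""
    ga = correlate_valid(image, kernel_a)
    gb = correlate_valid(image, kernel_b)
    return [[abs(a) + abs(b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(ga, gb)]


# A 4 x 4 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 9, 9] for _ in range(4)]

print(gradient_magnitude(step, ROBERTS_MAIN, ROBERTS_ANTI))
print(gradient_magnitude(step, SOBEL_X, SOBEL_Y))
```

On this step image the Roberts response is confined to the single column pair that straddles the edge, whereas the wider Sobel window responds at every valid position, which illustrates how the kernel size affects edge localisation.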


1.2 Laplacian 


A typical Laplacian scheme is the Marr-Hildreth zero-crossing method. This scheme
uses second-order differential information to locate the zero crossings that mark
edges, as shown in Fig. 2b.


1.3 Gaussian 


The Canny edge filter has played a significant role in advanced edge detection
applications since the late 1980s. This scheme first applies a Gaussian smoothing
filter, then gradient operations and finally thinning processes; its final results
are shown in Fig. 2c. Unlike the Gradient and Laplacian schemes, Canny edge detection
provides controllable parameters to balance noise levels against significant edge
components. Because of this controllability, the scheme is widely used in practical
applications that depend on significant edge components.


1.4 Mathematical Morphology 


Mathematical morphology has played an important role in advanced image analysis and
processing applications since the 1980s. Using discrete patterns as morphological
masks, the method applies erosion and dilation, and opening and closing, operations
to the processed images. This method distinguishes edge masks from non-edge masks. In
general, only translation invariance is retained in these operations. Each basic
operation uses one mask for either erosion or dilation, which respectively reduces or
extends the boundaries of the visual objects. There is no simple relationship between
the selected mask states and the edge states. Two edge maps obtained with a crossing
mask under erosion and dilation are shown in Fig. 2d. Each edge map is calculated as
the difference between the dilated or eroded output image and the input image.
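
The following Python sketch illustrates erosion and dilation with a cross-shaped mask on a binary image, and the edge map obtained as the difference between the processed image and its input. It rests on assumptions made here for illustration: pixels outside the image are treated as background (0), and the difference is taken as an exclusive OR. The names `CROSS`, `erode`, `dilate` and `edge_map` are hypothetical.

```python
# Cross-shaped ("crossing") mask: the pixel and its four direct neighbours.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _pixel(image, r, c):
    """Return the pixel value; positions outside the image count as 0."""
    if 0 <= r < len(image) and 0 <= c < len(image[0]):
        return image[r][c]
    return 0


def erode(image):
    """A pixel stays 1 only if every pixel under the mask is 1."""
    return [[int(all(_pixel(image, r + dr, c + dc) for dr, dc in CROSS))
             for c in range(len(image[0]))] for r in range(len(image))]


def dilate(image):
    """A pixel becomes 1 if any pixel under the mask is 1."""
    return [[int(any(_pixel(image, r + dr, c + dc) for dr, dc in CROSS))
             for c in range(len(image[0]))] for r in range(len(image))]


def edge_map(image, operation):
    """Edge map as the pixel-wise difference (XOR) of output and input."""
    output = operation(image)
    return [[o ^ i for o, i in zip(out_row, in_row)]
            for out_row, in_row in zip(output, image)]


# A 3 x 3 block of ones centred in a 5 x 5 background.
block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

inner_edge = edge_map(block, erode)    # ring of object pixels removed by erosion
outer_edge = edge_map(block, dilate)   # background pixels added by dilation
```

Erosion yields an edge inside the object and dilation an edge just outside it, so the two edge maps of the same object are not identical, which is consistent with the remark that mask states and edge states are not simply related.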
Fig. 2 Different edge detection methods. a Gradient; b Laplacian (zero crossing);
c Gaussian (Gaussian smoothing); d Mathematical morphology; e Conjugate (negative and
positive maps)
1.5 Conjugate


The conjugate scheme has been developed since the 1990s. It is based on a full
pattern classification of the nearest neighbourhood relationships of discrete states
on regular plane lattices under rotation, reflection and translation invariance. This
approach groups local patterns into invariant groups, such as isolated, inner, block
edge and intersection, to organise the whole pattern space as a hierarchical
construction. Both background and foreground information need to be represented as
balanced structures in the conjugate phase space. Under certain conditions, it is
feasible to use two types of edge maps in representations. In Fig. 2e, two typical
edge maps produced by the conjugate scheme are illustrated:


• Negative (white edge map on black background) and
• Positive (black edge map on white background).


From the viewpoint of edge detection, different operations have special properties
that emphasise different visual information in images. Simple convolution filters may
be fast, but they are very likely to be sensitive to minor noise levels. Among the
three traditional edge detection schemes, the Canny edge detector has the important
characteristic of providing a series of controllable edge maps with reliable
properties. Because distinct edge detectors behave differently, it is very hard for
users to simply select and apply the best one among the schemes. Mathematical
morphology applies discrete masks in its operations. Since edge maps normally do not
correspond directly to the masks themselves, it is difficult to establish a link
between the relevant operations and the edge detection results.

An edge detection operation extracts an edge map from a digital image. From this
viewpoint, a proper model is needed to determine the invariant properties of edge
detection schemes.


2 Recursive Model of Edge Accuracy 


Different edge detection methods cover various applications, each with advantages in
many respects. From a practical viewpoint, it is hard for users to judge which method
provides the best edge map for a given application. In the history of edge detection
research, no model has provided a general mechanism for the systematic comparison of
distinct methods. Since the purpose of every edge detection method is to create edge
maps, it is natural to ask under which conditions an edge map can represent the true
edge.
2.1 Question


Could an extracted edge map be a true edge representation? 

From a morphological viewpoint, a true edge map needs to have invariant properties
that reflect its geometric and topological constraints. In many theories and
practices related to dynamic systems and cybernetics, recursive methods and models
have proved to be of foundational importance in detailed analysis tasks. A recursive
model has been applied to test edge detection operators and explore their refined
properties, as shown in Fig. 1b. In this feedback mechanism, the edge map is looped
back and processed again by the same type of edge detection operator. The recursive
loop acts as an important magnifier that directly reveals the dynamic behaviour
between input and output pairs.


3 Four Types of Edge Accuracy Measures 


Under the recursive approach, a true edge representation must be identical to its
own recursive edge map. Such invariance under recursive operations can be observed as
an intrinsic property of the edge detection operators themselves. In addition to
invariant properties, many rich effects between input and output pairs need to be
considered. To judge recursive results properly, it is essential to apply four
different accuracy measures, shown in Fig. 1c. They are {=, ≈, ≠, ∅}, representing
accurate, almost accurate, inaccurate and trivial behaviours, respectively, between
input and output edge maps. By matching an extracted edge map against its recursive
edge extraction map, it is feasible to determine the category to which the generated
results belong. This provides a general model that is independent of any specific
edge detection scheme. Anyone who would like to check which category a particular
scheme belongs to can simply apply this recursive mechanism to that method directly.
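
The recursive check can be sketched as follows in Python. The chapter does not give numeric criteria for the four measures, so the rules below are assumptions made for illustration: an output with no edge pixels is treated as trivial (∅), identical maps as accurate (=), maps differing in at most a fraction `tolerance` of their pixels as almost accurate (≈), and everything else as inaccurate (≠). The function name `classify` and the `tolerance` parameter are hypothetical.

```python
def classify(edge_map, detector, tolerance=0.05):
    """Apply the detector to its own edge map and compare input with output.

    Returns one of "=", "≈", "≠", "∅" under the assumed rules described above.
    """
    recursive_map = detector(edge_map)
    if not any(any(row) for row in recursive_map):
        return "∅"                      # trivial: no edge pixels remain
    total = sum(len(row) for row in edge_map)
    differing = sum(a != b
                    for row_in, row_out in zip(edge_map, recursive_map)
                    for a, b in zip(row_in, row_out))
    if differing == 0:
        return "="                      # accurate: the map is invariant
    if differing / total <= tolerance:
        return "≈"                      # almost accurate
    return "≠"                          # inaccurate


# Toy detectors standing in for real edge operators (illustration only).
def identity(image):
    return [row[:] for row in image]


def clear(image):
    return [[0 for _ in row] for row in image]


def invert(image):
    return [[1 - value for value in row] for row in image]


def flip_corner(image):
    flipped = [row[:] for row in image]
    flipped[0][0] = 1 - flipped[0][0]
    return flipped


sample = [[0] * 5 for _ in range(5)]
sample[2][2] = 1

for detector in (identity, flip_corner, invert, clear):
    print(detector.__name__, classify(sample, detector))
```

Any real edge operator can be passed in place of the toy detectors, provided it maps an image to an edge map of the same size.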


4 Four Sample Groups of Recursive Edge Maps 


In Fig. 3a-d, four groups of recursive edge maps are illustrated. Two operators are
selected from Photoshop: Find Edges (Gradient) in Fig. 3a and Trace Contour (Zero
crossings) in Fig. 3b. The Find Edges operation has a clearly variant property, and
Trace Contour shows flip-flop behaviour after a certain number of operations. One
example from the Canny edge detector is shown in Fig. 3c; its recursive results show
no invariant edge map. Two sets of examples for mathematical morphology are shown in
Fig. 3d. It is interesting to see that dilation exhibits almost invariant properties,
while erosion creates edge maps similar to zero-crossing effects. To show the
different recursive properties of the conjugate scheme, four sub-operators are
illustrated in Fig. 3e1-e4.
Table 1 Edge detection schemes and their accuracy properties

Operator                 Edge quality              Noise sensitivity   Accuracy     Recursive maps
Find edge                Good, fair                Very high           ± 2 pixels   ≠
Trace contour            Better, good, fair        High                ± 1 pixel    ≠, ≈
Canny edge               Better, good, fair        Controllable        ± 2 pixels   ≠
Mathematical morphology  Better, good, fair        High                ± 1 pixel    ≈, ≠, ∅
Conjugate map            Best, better, good, fair  Fully controllable  < 1 pixel    =, ≈, ≠, ∅ (true edge)


Each group shows a specific category among the three non-trivial results. Conjugate
edge detection operators offer two types of controllable parameters: meta-shape
parameters {A, ..., L, a, ..., l} and an enhancement ratio control {-8, ..., 8}.
Together, these parameters can provide a universal representation of the true edge
map and support various edge representations under the selected operations.


5 Comparison 


Using the five categories, the results are summarised in Table 1. This provides a
systematic basis for comparison.
Fig. 3 Recursive maps of different edge detection operators. a Find edges; b Trace
contour; c Canny edge detection; d Morphology; e Conjugate edge detection

(A) Photoshop: Find Edges (Gradient). Panels (A1)-(A4): the first to fourth edge
maps. (A1) ≠ (A2) ≠ (A3) ≠ (A4): no invariant edge map available. Recursive
condition: the Find Edges filter is applied directly to each map.

(B) Photoshop: Trace Contour (Zero Crossing). Panels: (B1) the first map; (B2) the
third map; (B3) the 53rd map; (B4) the 54th map. (B1) ≠ (B2) ≠ (B3) = (B4): flip-flop
variations after the 53rd operation. Recursive condition: Trace Contour filter
(level = 119, edge = low).

(C) Canny Edge Detection (Gaussian smoothing + Gradient + Thinning). Panels: (C1) the
first edge map; (C3) the third edge map; (C4) the fourth edge map.
(C1) ≠ (C2) ≠ (C3) ≠ (C4): no invariant edge map available. Recursive condition:
sigma = 1, high threshold = 8, low threshold = 7.

(D) Mathematical Morphology. Panels (D11)-(D14): the first to fourth edge maps under
erosion with a crossing mask; (D11) ≠ (D12) ≠ (D13) ≠ (D14), edge maps invariant.
Panels (D21)-(D24): the first to fourth edge maps under dilation with a crossing
mask; (D21) ≈ (D22) ≈ (D23) ≈ (D24), edge maps almost invariant.

(E) Conjugate Edge Detection. Panels: (E13) the 100th edge map; (E14) the 1000th edge
map. (E11) = (E12) = (E13) = (E14): edge maps invariant. Recursive condition:
NM 50 50 10 abcdefghijkl -2.
6 Conclusion


Existing edge detection schemes do not have unique recursive maps as their
representations. Conjugate technology provides full control to create true edge maps
that are accurate and invariant.

True edge maps contain unique shape information of fundamental importance for
supporting all visual applications.
2D Spatial Distributions for Measures
of Random Sequences Using Conjugate
Maps


Qingping Li and Jeffrey Zheng 


Abstract Advanced visual tools are useful for providing additional information in
modern information warfare. 2D spatial distributions of random sequences play an
important role in understanding the properties of complex sequences. This chapter
presents time sequences generated from a given logical function of 1D cellular
automata in both Poincaré maps and conjugate maps. Multiple measure sequences of
Markov chains can be used to display spatial distributions using conjugate maps.
Measure sequences are produced recursively by the different logical functions that
generate the maps. Complementary features may exist between pairs of functions, and
conjugate symmetry relationships between a pair of logical functions can be observed
in conjugate maps.


Keywords Time sequence · Random property · Cellular automata ·
Spatial distribution · Conjugate symmetry


1 Introduction 


Random sequences are widely used in many security-based applications such as
secure communication, cryptographic coding and information security systems [1].
For proper analysis, Markov chain methodologies and technologies provide a series of
important methods and tools to help analysts in the decoding process [2-4]. In
modern information warfare, it is essential for analysts to detect and decrypt the
opponent's communications using information acquisition toolkits on real coding
sequences [5].
Information warfare refers to “actions” executed to achieve a sought outcome
(denial, exploitation, corruption and destruction of an opponent's “information”
and related functions) and to the prevention of such “actions” executed by an
opponent [6].

The battle between the obscurers and those who sought to break the codes has been
a continual one, but it reached a new level of stature and importance during World War
II with the decryption of Germany's Enigma messages. Historical events have proved
that statistical and probability tools are extremely important in information warfare
applications. This battle of wits fought by British mathematicians and statisticians
shortened World War II and ushered in the age of information warfare [7].

A prerequisite for executing these attack actions is a thorough understanding of the
information encryption mechanism that the opponent uses [8]. In information warfare,
secured communications between opposing parties may use public networks, so it is
feasible to capture relevant information for further analysis. Different quantitative
tools and methods can provide additional information in the decoding process. Variant
features play an important role in the measurement and analysis of random sequences [9].

Because the functions that generate random sequences are expressed implicitly,
it is hard to obtain the characteristics of random sequences from the functions and
coding sequences themselves [10]. Traditionally, the time sequence map and the
Poincare map are the two most popular methods for displaying the measured features
of a random sequence in two dimensions [11]. From a visual viewpoint, current Markov
chain schemes do not provide an efficient visual mechanism to display multiple
measurement sequences that capture the spatial characteristics of complex random
sequences.

To extract further information from random sequences, this chapter establishes a
visual system to illustrate multiparameter measurement sequences of Markov chains
as conjugate maps. For a given set of measurement sequences, the conjugate map
proposed in this chapter can provide more refined information about the distribution
structure than present map technologies [12].

In the second section, the respective characteristics of the traditional methods and
the conjugate method are discussed. The measurement mechanism for a logical
function's spatial characteristics, comprising the disposal model, the measure model,
and the visualization model, is described in the third section. The resulting maps and
their analysis are discussed in the fourth and fifth sections, and concluding remarks
are provided in the last section.


2 Traditional Methods and Conjugate Method 


In this section, two typical traditional methods, the time sequence map and the
Poincare map, are discussed for comparison.

The time sequence map uses a 2D coordinate system; the X-axis is determined by the
time scale t, and the Y-axis is determined by the value of the measured parameter
f(t), as shown in Fig. 1a.

Fig. 1 Simple time sequence map and Poincare map; a Time sequence map (axes t, f(t)),
b Poincare map (axes f(t), f(t+1))


The measure sequence {f(t)} of length T can form a Poincare map according to a
matching pattern that considers data correlation. The Poincare method maps one
group of measures of a time sequence to a 2D map and detects the spatial distribution
of the sequence through the distribution of the point cluster. In the Poincare map, the
X-axis is determined by the value of f(t) and the Y-axis by f(t + 1). This is the
vicinity-related pattern map, with lag l = 1, as shown in Fig. 1b.
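The two traditional maps can be sketched in a few lines of Python. This is only an illustration: the function names, the `lag` parameter, and the sample measure values are assumptions introduced here and do not come from the chapter.

```python
def time_sequence_points(f):
    """Return the points (t, f(t)) of a time sequence map."""
    return [(t, value) for t, value in enumerate(f)]


def poincare_points(f, lag=1):
    """Return the points (f(t), f(t + lag)) of a Poincare map."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [(f[t], f[t + lag]) for t in range(len(f) - lag)]


# Hypothetical measure sequence, used only for illustration.
measures = [3, 1, 4, 1, 5]
print(time_sequence_points(measures))
print(poincare_points(measures))
```

With lag 1, a sequence of length T yields T - 1 Poincare points, each pairing a measure with its immediate successor.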

Unlike the Poincare method, which is based on one group of measures, the new map
proposed in this chapter chooses two groups of measures from relevant parallel measure
sequences. As two different groups of measures act simultaneously, the values on the
two axes are determined by these two groups of measurements. It is convenient
to name the new map a conjugate map to denote this kind of multiple-parameter
measurement map.


3 Generate and Measure Mechanism of Time Sequence 


In this section, the Cellular Automata (CA) method is applied to generate a time
sequence and then to produce the accompanying measurement sequence. First, an initial
sequence is input, and the output sequence is generated by a given logical function
using 1D cellular automata. From this data sequence, measurements are formed by
probability measurement on pairs of input and output sequences. Finally, the generated
measure sequences are used to construct a 2D conjugate map showing the 2D spatial
distribution of the time sequence. The processing flow of the mechanism is shown in
Fig. 2.

Input Sequence -> Function f -> Output Sequence -> Probability Measuring ->
Measure Sequence -> Visualization (map)

Fig. 2 Flow sheet of the mechanism for producing and detecting time sequences


Table 1 I/O pattern of disposal model

Function f | Input sequence  | $X_0, X_1, \ldots, X_i, \ldots, X_{N-1}$, $X_i \in \{0, 1\}$
           | Output sequence | $Y_0, Y_1, \ldots, Y_i, \ldots, Y_{N-1}$, $Y_i \in \{0, 1\}$


Table 2 Exhaustion of initial input sequences

Serial number | Input sequence
0             | 000...000
1             | 000...001
...           | ...
$2^N - 2$     | 111...110
$2^N - 1$     | 111...111


3.1 Disposal Model 


Consider a logical function f as a function of CA. The function generates an output
sequence $\{Y_i\}_{i=0}^{N-1}$ of equal length for any initial input sequence
$\{X_i\}_{i=0}^{N-1}$ of N bits. The I/O pattern is shown in Table 1.

All $2^N$ states of the N-bit initial input sequence are generated exhaustively,
and the corresponding output sequence under the logical function f : X → Y is
generated for each. Each input sequence and its output sequence form a group, so
there are $2^N$ groups of corresponding relationships [13]. The exhaustion of
all initial input sequences is shown in Table 2.
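The exhaustion of Table 2 can be sketched as follows. The function name `all_inputs` is an assumption introduced for illustration; the ordering relies on `itertools.product`, which enumerates the bit tuples in the same order as the serial numbers of Table 2.

```python
from itertools import product


def all_inputs(n):
    """Return all 2**n initial input sequences of n bits, ordered by serial number."""
    return [list(bits) for bits in product((0, 1), repeat=n)]


inputs = all_inputs(3)
print(len(inputs))   # number of initial input sequences for N = 3
print(inputs[1])     # serial number 1
print(inputs[-1])    # serial number 2**N - 1
```

For N = 13, the condition used later in the chapter, this enumeration contains 8192 input sequences.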


3.2 Measure Model 


The basic model of measurement establishes the transformation relation between
the input sequence $\{X_i\}_{i=0}^{N-1}$ and the output sequence
$\{Y_i\}_{i=0}^{N-1}$ for each group.

In the transformation $f : X_i \to Y_i$, $0 \le i < N$, there are a total of four
types of transformations. The number of occurrences of each type is counted, and the
corresponding relationships are shown in Table 3. This type of measurement structure
corresponds directly to the Markov chain mechanism [4].

Table 3 Measure parameters

Transform type | Number of types | Number of 0, 1 in input sequences | Total number
0 → 0          | $N_{00}$        | $N_0 = N_{00} + N_{01}$           | $N = N_0 + N_1 = N_{00} + N_{01} + N_{10} + N_{11}$
0 → 1          | $N_{01}$        |                                   |
1 → 0          | $N_{10}$        | $N_1 = N_{10} + N_{11}$           |
1 → 1          | $N_{11}$        |                                   |

Table 4 Probability measure

Measure parameter | Value of parameter
$P_{00}(j)$       | $N_{00}(j)/N_0(j)$
$P_{01}(j)$       | $N_{01}(j)/N_0(j)$
$P_{10}(j)$       | $N_{10}(j)/N_1(j)$
$P_{11}(j)$       | $N_{11}(j)/N_1(j)$

Consider $j \in \{0, 1, 2, \ldots, 2^N - 1\}$ as the serial number of the different
initial input sequences. Four measurements with Markov chain properties can be
identified from the measurement parameters shown in Table 4. Over the different
initial input sequences, four probability measurement sequences can be established
on the corresponding I/O sequences: $\{P_{00}(j)\}_{j=0}^{2^N-1}$,
$\{P_{01}(j)\}_{j=0}^{2^N-1}$, $\{P_{10}(j)\}_{j=0}^{2^N-1}$, and
$\{P_{11}(j)\}_{j=0}^{2^N-1}$.
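The counts of Table 3 and the ratios of Table 4 can be computed for one input/output pair as in the following sketch. The function name and the sample pair are illustrative assumptions, and so is the choice to report a measure as `None` when its denominator is zero (for example, when the input contains no 0), a case the chapter does not address.

```python
def probability_measures(x, y):
    """Return (P00, P01, P10, P11) for equal-length 0-1 sequences x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for xi, yi in zip(x, y):
        counts[(xi, yi)] += 1
    n0 = counts[(0, 0)] + counts[(0, 1)]  # number of 0s in the input
    n1 = counts[(1, 0)] + counts[(1, 1)]  # number of 1s in the input

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio is reported as None.
        return numerator / denominator if denominator else None

    return (ratio(counts[(0, 0)], n0), ratio(counts[(0, 1)], n0),
            ratio(counts[(1, 0)], n1), ratio(counts[(1, 1)], n1))


# Illustrative input/output pair of length 6.
x = [0, 0, 1, 1, 0, 0]
y = [0, 1, 0, 1, 0, 0]
print(probability_measures(x, y))
```

By construction, the two measures that share a denominator sum to 1 whenever they are defined.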


3.3 Visualization Model 


Based on the probability measurements presented above, two measurements are
chosen to construct a 2D map. Because two different groups of measurements are used
simultaneously, this kind of map is named a conjugate map; the value on each axis is
determined by one of the two groups of measurements.

According to the construction pattern introduced above, there are $C_4^2 = 6$
different combinations: $\{P_{00}(j), P_{01}(j)\}$, $\{P_{00}(j), P_{10}(j)\}$,
$\{P_{00}(j), P_{11}(j)\}$, $\{P_{10}(j), P_{11}(j)\}$, $\{P_{01}(j), P_{11}(j)\}$,
and $\{P_{01}(j), P_{10}(j)\}$.

On the same group of sequences, 2D conjugate maps are constructed from each of the
combinations above, as shown in Fig. 3.

This chapter chooses the typical combination $\{P_{01}(j), P_{10}(j)\}$ to construct
2D conjugate maps that detect the spatial distribution of time sequences for the
N = 13 condition.

Fig. 3 2D conjugate maps constructed from the six separate pairs of measures of the
No. 6 function; N = 13
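The six pairings and the point set of one conjugate map can be sketched as follows. The names `measure_pairs` and `conjugate_points`, the dictionary layout, and the sample values are assumptions made for illustration; skipping rows with an undefined measure is likewise an assumption.

```python
from itertools import combinations

MEASURES = ("P00", "P01", "P10", "P11")


def measure_pairs():
    """Return the six unordered pairs of the four probability measures."""
    return list(combinations(MEASURES, 2))


def conjugate_points(rows, x_name, y_name):
    """Return the (x, y) points of one conjugate map.

    rows holds one dict per initial input sequence j, keyed by measure name.
    Rows with an undefined (None) measure are skipped, which is an assumption.
    """
    return [(row[x_name], row[y_name]) for row in rows
            if row[x_name] is not None and row[y_name] is not None]


# Hypothetical measure values for three input sequences.
rows = [
    {"P00": 0.75, "P01": 0.25, "P10": 0.5, "P11": 0.5},
    {"P00": 0.5, "P01": 0.5, "P10": 1.0, "P11": 0.0},
    {"P00": None, "P01": None, "P10": 0.0, "P11": 1.0},
]
print(measure_pairs())
print(conjugate_points(rows, "P01", "P10"))
```

Plotting the returned points for all $2^N$ input sequences of one function would give one panel of a figure such as Fig. 3.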


4 Visualization Result 


Because of restrictions on the structural complexity of the logical function, the 16
functions of 2 variables are used and described exhaustively [14]. Output sequences
are generated from different initial input sequences under the given logical function,
and various measure data are then obtained from the corresponding I/O sequences
using the probability method. The maps are then constructed from these measurement
data.

This chapter chooses the No. 1, 5, 6, and 13 functions as typical examples and
observes the characteristics of the three kinds of maps given in Fig. 4.


In (a) group of time sequence maps, only one measurement sequence transforms 
with time. 


In (b) group of Poincare maps, different functions form different point clusters. 


In (c) group of conjugate maps, the distribution of the points cluster has clear polar- 
ized properties. 


According to the variable-value logic theory, three kinds of encoding model can 
be distinguished: W, F, and C [15]. 

The visual information that can be acquired from a single function's map is rather
limited. To compare the spatial properties of different logical functions, a 4 × 4
array is constructed from the maps generated by the 16 logical functions in different
encoding patterns, as shown in Fig. 5.

Fig. 4 Time sequence maps, Poincare maps, and 2D conjugate maps. a Time sequence map; b
Poincare map; c 2D conjugate map


By assembling the maps of all 16 logical functions under these models, the overall
structural information among the logical functions themselves can be observed.

For convenient comparison, combinations of the 16 recursive images generated from
the 16 functions are given in this chapter under different codes. Recursive images
in W-code, F-code, and C-code from a given initial sequence are shown in Figs. 6,
7, and 8, respectively.

Fig. 5 Assemble pattern of maps in W-code, F-code, and C-code


Fig. 6 Recursive images in W-code


The combination of time sequence maps is shown in Fig. 9. The figure shows that
different functions have different distribution properties and also reveals how a
single measurement changes with time.

The combination of Poincare maps in W-code is shown in Fig. 10. Different
distribution properties of the functions can be observed from the figure. Four groups
of configurations clearly appear in the figure:
{0, 8, 2, 10}, {1, 3, 9, 11}, {4, 6, 12, 14}, {5, 7, 13, 15}.

Fig. 7 Recursive images in F-code


For W-code, Poincare maps are shown in Fig. 10 and the corresponding 2D conjugate
maps are shown in Fig. 11. Conjugate maps have polarized properties, and the
function pairs 0:15, 1:7, 2:11, 4:13, and 8:14 have conjugate symmetry. In general,
the 16 conjugate maps differ from the corresponding Poincare maps.

With the 16 Poincare maps and conjugate maps arranged by F-code structure, the
F-code maps are shown in Figs. 12 and 13, respectively.

Under C-code structure, Poincare maps and conjugate maps are shown in Figs. 14
and 15.

In the above maps, 2D conjugate maps not only show the spatial distributions of
different logical functions but also exhibit special holistic symmetries under the F-
and C-code conditions.


5 Analysis


Through the three different types of maps, three different coding schemes can be
observed.

Fig. 8 Recursive images in C-code


The time sequence map can show the simple trend of a single measurement series over
time, but it is difficult for this scheme to describe the spatial distribution of a
time sequence.

The Poincare map applies a single measurement sequence; although the map can
be generated with different correlation lengths, the distribution information is
naturally limited by the selected measurement sequence.

A 2D conjugate map uses two groups of independent measurements simultaneously.
This scheme can show differences and connections between the spatial distributions
of logical functions. Furthermore, through different coding models, it can illustrate
holistic relationships among different functions; for example, the function pairs
0:15, 1:7, 2:11, 4:13, and 8:14 have clear conjugate symmetry in conjugate maps. In
addition, under the C-code condition, the points of the four functions on each edge
of the map array are located on the same side of the map. For example, the point
clusters of the (0, 4, 1, 5), (0, 2, 8, 10), (10, 14, 11, 15), and (5, 7, 13, 15)
functions are located on the four sides of the 2D map space, respectively.

Fig. 9 Time sequence maps of 16 functions constructed by $\{t, P_{01}(t)\}$ sequences

Fig. 10 Poincare maps in W-code

Fig. 12 Poincare maps in F-code

Fig. 13 Conjugate maps in F-code

Fig. 14 Poincare maps in C-code

Fig. 15 Conjugate maps in C-code


6 Conclusion


Refined properties of various time sequences can be identified from 2D conjugate
maps, which illustrate multiple measurement sequences under the Markov chain
mechanism. The spatial properties of a time sequence play an important role in the
study of dynamic sequence behavior. The stable distributions revealed by the
visualization method can help people understand relevant issues.

Compared with Poincare maps, conjugate maps reveal additional properties of
complex dynamic sequences. The conjugate map method uses multiple parameters of
Markov chains to make independent measurements simultaneously.

The proposed technology can provide further structural information among multiple
measurements, and refined relationships can be established via spatial distributions.
The scheme makes it possible to use statistical and probability methodologies to
enhance visual tools for Markov chain mechanisms and to address real problems and
requirements in modern information warfare and information security applications in
the near future.


Permutation and Complementary Algorithm to Generate Random Sequences for Binary Logic


Jie Wan and Jeffrey Zheng 


Abstract Random number generation plays a key role in network, information
security, and IT applications. In this chapter, a permutation and complementary
algorithm is proposed that uses vector complementary and permutation operations to
extend the n-variable logic function space from $2^{2^n}$ functions to
$2^{2^n} \times 2^n!$ configurations for the variant logic framework. Each
configuration contains $2^{2^n}$ functions that can be shown in a
$2^{2^{n-1}} \times 2^{2^{n-1}}$ matrix. A set of visual results is presented through
their symmetry properties in W, F, and C codes, respectively, to provide essential
support for the variant logic framework.


Keywords Logic function · Permutation and complementary · Variant logic ·
Symmetric distribution · Random sequence


1 Introduction 


Random numbers play an important role in many network protocols and encryption
schemes in various network security applications [1], for example, digital signatures,
authentication protocols, key generation for PKI, RSA/AES [2], nonces, and symmetric
stream encryption. A better random number algorithm strengthens encryption schemes
and benefits other applications. To satisfy different requirements, NIST has published
a series of statistical tests as standards [3] to determine whether a random number
generator is suitable for a cryptographic application. By applying the vector
complementary and permutation operations to binary logic, the variant logic framework
extends the traditional logic function space from $2^{2^n}$ functions to
$2^{2^n} \times 2^n!$ configurations [4]. Under the new extension conditions, it is
possible to use simple transformations to generate huge numbers of random sequences
for future applications.

The permutation and complementary algorithm is described in this chapter to express
different random properties through a series of binary image sequences produced by
typical recursive operations.


2 Method 


Cellular automata provide a natural way to generate random sequences. The principle
of binary cellular automata [5, 6] can be explained by the following example.

First, a sequence 001100 and a function f : {00 → 0, 01 → 1, 10 → 1, 11 → 0}
are selected.

Second, the sequence is decomposed from left to right into pairs of adjacent bits.
The last bit is paired with the first bit:


001100 → {00, 01, 11, 10, 00, 00}


Third, according to the decomposed sequence and the generating function, a new
sequence 010100 is generated, i.e., f : 001100 → 010100.
Following this algorithm, the space of generating functions can be extended
further, and large numbers of random sequences can be generated. This mechanism can
increase the complexity of code breaking.

In the variant logic framework, the logic function space is extended from $2^{2^n}$
to $2^{2^n} \times 2^n!$ by the permutation and complementary operations. For
two-variable functions of cellular automata, there are 16 generating functions, and
the 16 functions can be described in a truth table (Fig. 1a) with 16 entries.


2.1 Permutation Operation 


The bit strings of the states {00, 01, 10, 11} in a generating function can be
converted to the decimal numbers {0, 1, 2, 3}. Fig. 1b shows an example in which the
state order 3210 of the table is permuted to 1320.

(a) The truth table of 3210

J   | 3 (11) | 2 (10) | 1 (01) | 0 (00) | K
0   | 0      | 0      | 0      | 0      | 0
1   | 0      | 0      | 0      | 1      | 1
2   | 0      | 0      | 1      | 0      | 2
3   | 0      | 0      | 1      | 1      | 3
4   | 0      | 1      | 0      | 0      | 4
... |        |        |        |        |

(b) The permutation table of 1320

J   | 1 (01) | 3 (11) | 2 (10) | 0 (00) | K
0   | 0      | 0      | 0      | 0      | 0
1   | 0      | 0      | 0      | 1      | 1
2   | 1      | 0      | 0      | 0      | 8
3   | 1      | 0      | 0      | 1      | 9
4   | 0      | 0      | 1      | 0      | 2
... |        |        |        |        |

Fig. 1 Permutation example


2.2 Complementary Operation 


In the complementary operation, the complementary vector σ is applied to
the truth table.


It can be described as


$$y^{\sigma} = \begin{cases} y, & \sigma = 1 \\ \bar{y}, & \sigma = 0 \end{cases}$$

In two-variable variant logic, σ is a binary sequence of 4 bits in {0000, ..., 1111}.


In the example, the original table corresponds to σ = 1111 and is shown in Fig. 2a.
Given σ = 1100, the configuration (see Table 2) can be written as
$1320^{1100} = 1^1 3^1 2^0 0^0$. Under this operation, the values in the columns of
states 1 and 3 are invariant, while the values in the columns whose σ bit is 0, that
is, the columns of states 2 and 0, are changed to their complemented values.

After the complementary operation, Fig. 2a changes to Fig. 2b.
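The permutation operation of Sect. 2.1 and the complementary operation described here can be sketched with bit operations. The function names `permute` and `complement` are assumptions made for illustration, and bit `state` of the function number J is taken to be the output for that state, an interpretation that matches the rows visible in Figs. 1 and 2.

```python
def permute(j, order=(1, 3, 2, 0)):
    """Re-read truth table j (columns 3210) with its columns in the given order."""
    k = 0
    for state in order:
        k = (k << 1) | ((j >> state) & 1)
    return k


def complement(k, sigma, width=4):
    """Invert the bits of k whose sigma bit is 0; keep those whose sigma bit is 1."""
    mask = (1 << width) - 1
    return k ^ (~sigma & mask)


# Rows J shown in Fig. 2: K after permutation 1320, then after sigma = 1100.
for j in (0, 1, 2, 3, 4, 13, 14, 15):
    k = permute(j)
    print(j, k, complement(k, 0b1100))
```

With σ = 1111 the complementary step leaves every K unchanged, which corresponds to the original table.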


2.3 Visualization 


For a function f, once it has been applied to the sequence 001100 to output 010100,
it can be applied again to the sequence 010100 to output 111100. For such binary
sequences, black is selected for 1 and white for 0 to generate visual patterns
(Fig. 3).

(a) The permutation table of $1320^{1111}$

σ   | 1      | 1      | 1      | 1      |
J   | 1 (01) | 3 (11) | 2 (10) | 0 (00) | K
0   | 0      | 0      | 0      | 0      | 0
1   | 0      | 0      | 0      | 1      | 1
2   | 1      | 0      | 0      | 0      | 8
3   | 1      | 0      | 0      | 1      | 9
4   | 0      | 0      | 1      | 0      | 2
... |        |        |        |        |
13  | 0      | 1      | 1      | 1      | 7
14  | 1      | 1      | 1      | 0      | 14
15  | 1      | 1      | 1      | 1      | 15

(b) The complementary table of $1320^{1100}$

σ   | 1      | 1      | 0      | 0      |
J   | 1 (01) | 3 (11) | 2 (10) | 0 (00) | K
0   | 0      | 0      | 1      | 1      | 3
1   | 0      | 0      | 1      | 0      | 2
2   | 1      | 0      | 1      | 1      | 11
3   | 1      | 0      | 1      | 0      | 10
4   | 0      | 0      | 0      | 1      | 1
... |        |        |        |        |
13  | 0      | 1      | 0      | 0      | 4
14  | 1      | 1      | 0      | 1      | 13
15  | 1      | 1      | 0      | 0      | 12

Fig. 2 Complementary example


Fig. 3 Visualize the random sequence


2.4 Matrix Representation

For example (Fig. 2b), the truth value of the third function is 1010. It can be
converted to a binary coordinate (10|10), split into its left two and right two bits.
The decimal coordinate is therefore (2|2). In this way, Fig. 2b can be converted to
Table 1. Under such a conversion, the 2D matrix can be represented as in Table 2.
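The conversion from a truth value to a matrix coordinate can be sketched as follows. The function name `coordinate` and its parameter `n` are illustrative assumptions; for two-variable functions the truth value has 4 bits, so each half has 2 bits.

```python
def coordinate(k, n=2):
    """Split the truth value k of an n-variable function into (row, column)."""
    half_bits = 2 ** (n - 1)           # number of bits on each side of the bar
    return divmod(k, 2 ** half_bits)   # (left half, right half) as integers


print(coordinate(0b1010))  # truth value 1010 -> (2, 2)
print(coordinate(0b0011))  # truth value 0011 -> (0, 3)
```

Applying `coordinate` to the K column of Fig. 2b reproduces the coordinates listed in Table 1.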


3 Algorithm and Properties 


3.1 Permutation and Complementary Algorithm 


Using the permutation and complementary operations, an algorithm is developed to
express the n-ary variant logic functional space.

Table 1 Coordinate map of $1320^{1100}$

σ   | 1      | 1      | 0      | 0      |
J   | 1 (01) | 3 (11) | 2 (10) | 0 (00) | Coordinate
0   | 0      | 0      | 1      | 1      | (0, 3)
1   | 0      | 0      | 1      | 0      | (0, 2)
2   | 1      | 0      | 1      | 1      | (2, 3)
3   | 1      | 0      | 1      | 0      | (2, 2)
4   | 0      | 0      | 0      | 1      | (0, 1)
... |        |        |        |        |
13  | 0      | 1      | 0      | 0      | (1, 0)
14  | 1      | 1      | 0      | 1      | (3, 1)
15  | 1      | 1      | 0      | 0      | (3, 0)


Table 2 2D matrix of $1320^{1100}$


Algorithm: Permutation and Complementary

Input: variable n

Output: a set of truth tables $P^{\sigma}$, $\forall P \in S(2^n)$, $\forall \sigma \in B^{2^n}$.

Method:

Step 1. Initialize $T = \{2^n - 1, \ldots, 1, 0\}$

Step 2. Generate a permutation P of T

Step 3. For σ from 000...0 to 111...1, perform the vector complementary operation.

Step 4. Is there any new permutation? If yes, go to Step 2.

Step 5. End


where S(N) is the symmetric group on N members and $B^M$ is an M-variable Boolean
structure with $2^M$ members.

Table 3 2D matrix for n-ary logic functions

(0, 0)                   | ... | (0, $2^{2^{n-1}} - 1$)
(1, 0)                   | ... | (1, $2^{2^{n-1}} - 1$)
...                      | ... | ...
($2^{2^{n-1}} - 2$, 0)   | ... | ($2^{2^{n-1}} - 2$, $2^{2^{n-1}} - 1$)
($2^{2^{n-1}} - 1$, 0)   | ... | ($2^{2^{n-1}} - 1$, $2^{2^{n-1}} - 1$)

Table 4 The number of W, F, and C codes in 2-ary variant functional space

Code system | No.
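The enumeration performed by the permutation and complementary algorithm of Sect. 3.1 can be sketched with the standard library. This is a minimal illustration rather than the authors' implementation; the generator name `configurations` is an assumption, and each configuration is represented only by its pair (P, σ) without building the truth tables.

```python
from itertools import permutations, product


def configurations(n):
    """Yield (P, sigma) for every permutation P of the 2**n states and every sigma."""
    states = tuple(range(2 ** n - 1, -1, -1))         # Step 1: T = {2^n - 1, ..., 1, 0}
    for perm in permutations(states):                 # Steps 2 and 4: each permutation once
        for sigma in product((0, 1), repeat=2 ** n):  # Step 3: sigma from 00...0 to 11...1
            yield perm, sigma


count = sum(1 for _ in configurations(2))
print(count)  # 4! * 2**4 = 384 configurations for n = 2
```

The count grows as $2^n! \times 2^{2^n}$, so exhaustive enumeration is practical only for small n.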


3.2 Representation Scheme 


Every truth table has a 2D matrix in which the visual results of random sequences
are arranged. The coordinate (X, Y) locates each visual result. For the n-ary logic
function space, the 2D matrix therefore has a size of
$2^{2^{n-1}} \times 2^{2^{n-1}}$, as shown in Table 3.


3.3 W, F, and C 


Three coding schemes can be distinguished in the algorithm.

A W code [4] is a binary sequence of $2^n$ bits. It is separated into two parts,
$(J^1 | J^0)$, each of which has $2^{n-1}$ bits.

The F code is a subset of the W code, and it is a symmetric code. In an F code, if
the l-th meta-state in $J^1$ is 1 or 0, the l-th meta-state in $J^0$ is its negation.

If a code is an F code in which the l-th meta-states in $J^1$ have the same value
and, in addition, the four corners of its matrix are included in {0, x, x, 1}, it is
a C code [4].

For example, (32|10)(1110|0100) is an element of the W code. In its permutation
sequence, 1 is not the negation of 3, and 0 is not the negation of 2.
(32|01)(1110|0001) is an F code and has the symmetry property: 0 is the negation of
3, and 1 is the negation of 2. (13|20)(0111|1000) is a C code. It has the symmetry
property of the F code, and the four corners of the matrix of 1320 are included in
{0, x, x, 1}.

Further definitions of the W, F, and C codes can be found in [4].

From an exhaustive enumeration of the binary variant function space, the numbers of
W, F, and C codes in the binary variant function space [7] are shown in Table 4.


4 Coding Samples


W Code:
Permutation sequence: 3210
The value of σ: 1011


Fig. 4 The 2D matrix diagram and the visual result of $3210^{1011}$


F Code:
Permutation sequence: 3201
The value of σ: 1111


Fig. 5 The 2D matrix diagram and the visual result of $3201^{1111}$


C Code:
Permutation sequence: 1320
The value of σ: 1100


Fig. 6 The 2D matrix diagram and the visual result of $1320^{1100}$


5 Result Analysis 


In Fig. 4, the W code is shown as a general code. Most W codes do not have an
apparent symmetry property. The W code covers the entire code space formed from
binary input variables. These properties can be seen in Fig. 4.

All F codes have overall symmetry in their 2D distribution. Obvious symmetry
among functions in the 2D matrix can be observed in Fig. 5.

A sample C code is shown in Fig. 6. C codes form a small subset of F codes with the
complete symmetry property. The C code has the four-constant-vertex property: its
four vertexes are occupied by the 0, 15, 10, and 5 functions, respectively.

In the permutation and complementary algorithm for n-ary logical functions, the
permutation step is executed $2^n!$ times, and the exhaustive complementary step
needs $2^{2^n}$ operations for each permutation. The total computational complexity
of an n-ary variant logical function using the permutation and complementary
algorithm is therefore $O(2^n! \times 2^{2^n})$.


6 Conclusion 


A permutation and complementary algorithm has been proposed for n-ary logical
functions, and sample results have been visualized. The visual results of W, F, and
C codes, with their variant and invariant properties, support through experimentation
the use of the variant logic system as an algorithmic mechanism for generating large
numbers of random sequences.


3D Visual Method of Variant Logic Construction for Random Sequence


Huan Wang and Jeffrey Zheng 


Abstract As Internet security threats continue to evolve, various encryption and
decryption schemes are used in the channel coding and decoding of data communication
to ensure the security of information transmission. Because cryptography requires
a very high degree of apparent randomness, random sequences play an important
role in it. Both Cellular Automata (CA) and RC4 contain pseudorandom number
generators, each of which may have its own intrinsic properties. In this chapter, a
3D visualization model, 3DVM, is proposed to display the spatial characteristics of
random sequences from CA or an RC4 keystream. The key components of this model and
its core mechanism are described, and every module and its I/O parameters are
discussed. A series of CA logic functions is selected as examples and compared with
some RC4 keystreams to show their intrinsic properties in three-dimensional space.
The visual results are briefly analyzed to explore their intrinsic properties,
including similarities and differences. The results support the exploration of the
RC4 algorithm with 3D visualization tools that organize its interactive properties
as visual maps.


Keywords Pseudorandom sequence · CA · Stream cipher · RC4 keystream ·
3D maps


1 Introduction 


Wireless Sensor Networks (WSN) and Wireless Networks (WN) are among the most popular
and widely used types of network of this era. Because of their openness, these types
of networks are not very secure. To provide security over WSN and WN, the algorithm
used must be fast enough to encrypt and decrypt data in a comparatively short time
while requiring few resources. For this purpose, the Wi-Fi Protected Access (WPA) and
Wired Equivalent Privacy (WEP) protocols are used as standards. These standards
adopted the RC4 stream cipher algorithm to secure data in the WN environment because
RC4 gives speedy encryption and decryption of data, uses few hardware resources
during processing, and is easy to implement [1, 2]. At present, the RC4 algorithm
is not secure in many respects. Many weaknesses and attacks have been found by
cryptanalysis [3, 4].


1.1 The Weakness of RC4 


The RC4 algorithm is a stream cipher and belongs to the symmetric cipher algorithms.
Typically, in a stream cipher, the keystream is the sequence that is combined digit
by digit with the plaintext sequence to obtain the ciphertext sequence. In RC4, data
encryption is equivalent to a simple XOR with the keystream. The keystream is
generated by a finite state automaton called the keystream generator [5, 6]. The
encryption can be broken if different plaintexts are encrypted using the same
keystream. The security of RC4 therefore depends entirely on the keystream generated
by the RC4 keystream generator.

Because it is very hard to trace the characteristics of keystream generators, the
random characteristics of a keystream can be investigated through the spatial
characteristics of the keystream generator in order to test pseudorandom sequences.
This chapter extends the work of Qingping Li [7] from 2D to 3D. In this chapter,
random sequences from given keystreams are collected and compared with random
sequences generated by sample logical functions of 1D Cellular Automata to show
their intrinsic properties in a three-dimensional space of relationships.


1.2 CA 


Cellular Automata were a great discovery of the twentieth century. A CA forms a time
series according to a given function in an iterative process by introducing a logic
function and related calculation methods in a natural pattern [8]. In 1985, S. Wolfram
formed a sequential cipher from a pseudorandom sequence generated by logic calculation
using cellular automata. Because the logic function is expressed implicitly, its
spatial characteristics cannot be observed directly from the function formula [9].


2 Architecture


2.1 Architecture 


The architecture is shown in Fig. 1a. The three main components and their modules
are shown in Fig. 1b-d, respectively.

In the first part of this system, two types of data sets are generated by CMCA
and RC4KCM, respectively. The data sets from either CMCA or RC4KCM enter the MM
module as input data. The main function of the VM module is to output the four
vectors of variant measurements. Using a unified or non-unified method, six
probability measurements are created by the PM module. To establish 3D maps, three
vectors of probability measurements are selected from the six probability
measurements by the SM module. The three vectors determine a 3D spatial position,
and all vectors together generate a 3D map using 3DVM.

Fig. 2 Two sets of six 3D maps based on unified model in different conditions; a1-a3 for the file
CA; b1-b3 for the file RC4

There are six parameters in the input group, three sets of parameters in the
intermediate group, and one set of parameters in the output group.
Input Group: 


An integer indicates the serial number of logic function or the value of the key 
selected 

An integer indicates which model is selected 

An integer indicates the number of elements in the binary sequence 

An integer indicates the number of elements in a segment 

An integer indicates the method of selection mechanism 

An integer indicates the control parameter for mapping 


Intermediate Group: 


A 0-1 vector generated by CA logic function or RC4 keystream generator 
A set of four variant measures 
A set of six probability vectors 


Output Group: 
3D maps 


2.2 Computation Model of CA (CMCA) 


The CMCA module is used to measure the features of a logic function based on Cellular
Automata (CA). Consider a logic function f : Y = f(X) as a function of CA; the output
sequence Y is generated from the given initial input sequence X of two-state
elements. For an N-bit initial input sequence, a total of $2^N$ states are generated
under the logic function f : X → Y. A pair of vectors (X, Y) is collected for each
input-output correspondence, so there are $2^N$ groups of this corresponding
relationship.

Input Group:


X   A 0-1 vector with N elements, $X \in B_2^N$
N   An integer indicating the number of elements in the 0-1 vector
f   A function with 2 variables


Intermediate Group:


Y   A 0-1 vector with N elements, $Y \in B_2^N$


Output Group:


∀Y  Exhaustive set of all states of N-bit vectors, with $2^N$ elements


2.3 Computation Model of RC4 Keystream (RC4KCM) 


For an L bits input keystream K, divided into G segments and W = L/G bits of each 
segment with G<L. The value of parameter G determines the amount of points and 
W determines the spatial distribution for the output keystream in the phase space. 
Input Group: 


A 0-1 vector with L elements generated by RC4 keystream generator 


L An integer indicates the number of elements in an input sequence, 
G An integer indicates the number of segments divided, 
W An integer indicates the number of elements in a segment. 


Output Group: 
G sets of W bits 0-1 vectors 


The RC4KCM component takes an input vector and divides it into segments under a chosen segment strategy. Its output is G sets of W-bit 0-1 vectors.
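
The segmentation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `segment_keystream` and the sample vector are hypothetical, and dropping trailing bits when W does not divide L is an assumption, since the text only covers W = L/G.

```python
from typing import List


def segment_keystream(bits: List[int], W: int) -> List[List[int]]:
    """Split an L-bit 0-1 vector into G = L // W segments of W bits each.

    Trailing bits that do not fill a whole segment are dropped (an assumption;
    the text only describes the case W = L / G).
    """
    if W <= 0 or W > len(bits):
        raise ValueError("W must satisfy 0 < W <= L")
    G = len(bits) // W
    return [bits[g * W:(g + 1) * W] for g in range(G)]


# Made-up 12-bit vector for illustration (not real RC4 output).
K = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]
segments = segment_keystream(K, W=4)
```

Each returned segment would then be passed to the measuring mechanism as one W-bit vector, giving one point per segment.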


2.4 Measure Mechanism (MM) 


The MM component shown in Fig. 1c is composed of three modules: Variant Measure (VM), Probability Measurement (PM), and Selection Mechanism (SM). Three parameters are taken as input signals. The VM module outputs four variant measures, the PM module derives six probability measurements from these variant measures, and the SM module selects a set of triples as interactive projections.

Input Group: 


V   A symbol selected from the four types of transformations $\{\perp, +, -, \top\}$
N   An integer indicating the number of elements in an input vector
    A 0-1 data vector

Intermediate Group:


VM($R^V$)   A set of four variant measures
PM($P^V$)   A set of four probability vectors

Output Group:


$U \subset V$   A set of three interactive projections under the SM condition
PM($P^V$)   A set of three probability vectors


2.5 Variant Measure (VM) 


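Considering the transformation of every bit between the input sequence $\{X_i\}_{i=0}^{N-1}$ and the output sequence $\{Y_i\}_{i=0}^{N-1}$, there are a total of four types of transformations: $0 \to 0$, $0 \to 1$, $1 \to 0$, and $1 \to 1$ [10, 11].

Define the variant representation as follows:

$$
\Delta(X_i \to Y_i) =
\begin{cases}
\perp, & X_i = 0,\ Y_i = 0; \\
+, & X_i = 0,\ Y_i = 1; \\
-, & X_i = 1,\ Y_i = 0; \\
\top, & X_i = 1,\ Y_i = 1;
\end{cases}
\qquad 0 \le i < N,\ X_i, Y_i \in B_2
$$

For any N-bit 0-1 vector $X = X_0 X_1 \ldots X_i \ldots X_{N-1}$, $0 \le i < N$, $X_i \in B_2$, $X \in B_2^N$, under a 2-variable function f, the N-bit 0-1 output vector is $Y = Y_0 Y_1 \ldots Y_i \ldots Y_{N-1}$, $0 \le i < N$, $Y_i \in B_2$, $Y \in B_2^N$. Let $\Delta$ be the variant measure function.

$$
\Delta(X \to Y) = \sum_{i=0}^{N-1} \Delta(X_i \to Y_i) = (R_\perp, R_+, R_-, R_\top), \quad N = R_\perp + R_+ + R_- + R_\top, \quad R_0 = R_\perp + R_+, \quad R_1 = R_- + R_\top
$$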


Example
N = 13, Y = f(X).


X = 1001011100101
Y = 0010110101100
$\Delta(X \to Y) = -\ \perp\ +\ -\ +\ \top\ -\ \top\ \perp\ +\ \top\ \perp\ -$
$(R_\perp, R_+, R_-, R_\top) = (3, 3, 4, 3)$, $R_0 = 6$, $R_1 = 7$, $N = 13$


Each input/output pair consists of 0-1 variables, so only four combinations occur. For any given function, the counts of $\{\perp, +, -, \top\}$ are derived directly from the input/output sequences. Four meta measures are thus determined [12].
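
The counting in the example above can be reproduced with a short sketch. The function name `variant_measure` and the string labels `"bot"` and `"top"` (standing in for the symbols for 0→0 and 1→1) are illustrative choices, not notation from the text.

```python
from collections import Counter
from typing import Dict

# Labels for the four bit transformations; "bot" is 0->0, "top" is 1->1.
SYMBOL = {(0, 0): "bot", (0, 1): "+", (1, 0): "-", (1, 1): "top"}


def variant_measure(x: str, y: str) -> Dict[str, int]:
    """Count the four transformation types between equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same length")
    counts = Counter(SYMBOL[(int(a), int(b))] for a, b in zip(x, y))
    return {s: counts.get(s, 0) for s in ("bot", "+", "-", "top")}


# The N = 13 example from the text.
R = variant_measure("1001011100101", "0010110101100")
R0 = R["bot"] + R["+"]   # number of 0s in X
R1 = R["-"] + R["top"]   # number of 1s in X
```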

Input Group: 


V   A symbol selected from the four types of transformations $\{\perp, +, -, \top\}$
N   An integer indicating the number of elements in an input vector
    A 0-1 data vector

Output Group:


VM($R^V$)   A set of four variant measures
$R_0$   An integer indicating the number of 0s in an input vector
$R_1$   An integer indicating the number of 1s in an input vector


2.6 Probability Measurement (PM) 


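The variant measure parameters and the other three parameters are taken as input signals; the output probability signals are calculated as eight measurements in two groups by the following equations.

The first group of probability signal vectors $p$ is called a non-unified model and defined as follows:

$$
p = \{p_\perp, p_+, p_-, p_\top\}, \qquad p_\alpha = \frac{R_\alpha}{N},\ \alpha \in \{\perp, +, -, \top\}, \qquad p_0 = \frac{R_0}{N}, \qquad p_1 = \frac{R_1}{N}
$$

The second group of probability signal vectors $\tilde{p}$ is called a unified model and defined as follows:

$$
\tilde{p} = \{\tilde{p}_\perp, \tilde{p}_+, \tilde{p}_-, \tilde{p}_\top\}, \qquad \tilde{p}_\alpha = \frac{R_\alpha}{R_0},\ \alpha \in \{\perp, +\}, \qquad \tilde{p}_\beta = \frac{R_\beta}{R_1},\ \beta \in \{-, \top\}, \qquad \tilde{p}_0 = \frac{R_0}{N}, \qquad \tilde{p}_1 = \frac{R_1}{N}
$$

Under such conditions, the output signals of the PM module can be expressed as a pair of probability vectors in quaternion form, $PM(P^V) = \{p, \tilde{p}\}$.
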
Input Group: 


V   A symbol selected from the four types of transformations $\{\perp, +, -, \top\}$
N   An integer indicating the number of elements in an input vector
VM($R^V$)   A set of four variant measures
$R_0$   An integer indicating the number of 0s in an input vector
$R_1$   An integer indicating the number of 1s in an input vector

Output Group:


PM($P^V$)   A set of four probability vectors
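
A minimal sketch of the PM step follows. It assumes that the non-unified model divides every count by N, and that the unified model divides the counts of 0→0 and 0→1 by $R_0$ and the counts of 1→0 and 1→1 by $R_1$. This reading of the printed equations is an assumption, as are the function name and the `"bot"`/`"top"` labels.

```python
from fractions import Fraction
from typing import Dict, Optional, Tuple

Measures = Dict[str, Optional[Fraction]]


def probability_measures(R: Dict[str, int]) -> Tuple[Measures, Measures]:
    """Return (non_unified, unified) probability measures from variant counts."""
    keys = ("bot", "+", "-", "top")
    N = sum(R[k] for k in keys)
    if N == 0:
        raise ValueError("empty input vector")
    R0 = R["bot"] + R["+"]   # number of 0s in the input
    R1 = R["-"] + R["top"]   # number of 1s in the input

    def ratio(num: int, den: int) -> Optional[Fraction]:
        # Undefined when the input has no 0s (or no 1s); returned as None.
        return Fraction(num, den) if den else None

    non_unified: Measures = {k: Fraction(R[k], N) for k in keys}
    non_unified["0"] = Fraction(R0, N)
    non_unified["1"] = Fraction(R1, N)

    unified: Measures = {
        "bot": ratio(R["bot"], R0),
        "+": ratio(R["+"], R0),
        "-": ratio(R["-"], R1),
        "top": ratio(R["top"], R1),
        "0": Fraction(R0, N),
        "1": Fraction(R1, N),
    }
    return non_unified, unified


p, p_tilde = probability_measures({"bot": 3, "+": 3, "-": 4, "top": 3})
```

Exact fractions are used so that identities such as $p_\perp + p_+ = p_0$ can be checked without rounding error.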
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2.7 Selection Mechanism Module 


The SM Module is composed of two models: Non-unified Model and Unified Model. 
Under different constructions, two models are established respectively as follows. 


Non-unified Model 


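Selecting two measurements from the four combinations $\{p_\perp, p_+, p_-, p_\top\}$ gives $C_4^2$ choices. Selecting one measurement from the two combinations $\{p_0, p_1\}$ then gives $C_2^1$ choices. A 3-tuple S is defined as follows:

$$
S = (p_\alpha, p_\beta, p_\gamma), \qquad S' = (p_\beta, p_\alpha, p_\gamma), \qquad \alpha, \beta \in V,\ \gamma \in \{0, 1\},\ \alpha \neq \beta, \qquad S = S'
$$


Unified Model


Selecting two measurements from the four combinations $\{\tilde{p}_\perp, \tilde{p}_+, \tilde{p}_-, \tilde{p}_\top\}$ gives $C_4^2$ choices. Selecting one measurement from the two combinations $\{\tilde{p}_0, \tilde{p}_1\}$ then gives $C_2^1$ choices. A 3-tuple $\tilde{S}$ is defined as follows:

$$
\tilde{S} = (\tilde{p}_\alpha, \tilde{p}_\beta, \tilde{p}_\gamma), \qquad \tilde{S}' = (\tilde{p}_\beta, \tilde{p}_\alpha, \tilde{p}_\gamma), \qquad \alpha, \beta \in V,\ \gamma \in \{0, 1\},\ \alpha \neq \beta, \qquad \tilde{S} = \tilde{S}'
$$

Under such conditions, the output signals of the SM module can be expressed as a 3D visual model in 3-tuple form, $S$ or $\tilde{S}$. Specifically, $p_\alpha$ or $\tilde{p}_\alpha$ determines the value on the X-axis, $p_\beta$ or $\tilde{p}_\beta$ the value on the Y-axis, and $p_\gamma$ or $\tilde{p}_\gamma$ the value on the Z-axis.
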
Input Group: 


PM($P^V$)   A set of four probability vectors

Output Group:


$U \subset V$   A set of three interactive projections under the SM condition
PM($P^V$)   A set of three probability vectors
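
The selection step amounts to enumerating unordered pairs of the four variant measures together with one of the two totals, which gives 6 × 2 = 12 candidate 3-tuples. The sketch below illustrates this; the names `enumerate_selections` and `project` and the dictionary keys are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

V = ("bot", "+", "-", "top")   # labels for the four variant measures
Selection = Tuple[str, str, str]


def enumerate_selections() -> List[Selection]:
    """All (alpha, beta, gamma): alpha != beta taken unordered, gamma in {0, 1}."""
    return [(a, b, g) for a, b in combinations(V, 2) for g in ("0", "1")]


def project(p: Dict[str, float], sel: Selection) -> Tuple[float, float, float]:
    """Map a probability vector to an (x, y, z) point for one selection."""
    a, b, g = sel
    return (float(p[a]), float(p[b]), float(p[g]))


selections = enumerate_selections()
point = project(
    {"bot": 0.25, "+": 0.25, "-": 0.25, "top": 0.25, "0": 0.5, "1": 0.5},
    ("bot", "+", "0"),
)
```

Because (alpha, beta) and (beta, alpha) are treated as the same selection, `combinations` is used rather than `permutations`.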


2.8 Visualization Model 


Using a visual model, all possible measurements are calculated exhaustively on 
all G-1 vectors. Each 3-tuple can be drawn as a point in three-dimensional space 
(xyz-space). All G-1 points are constructed in the phase space for the selected keys. 
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3 Sample Results on 3D Maps 


In this section, two types of data sets are selected to illustrate their differences on 
3D maps for comparison. The first type of data sets is generated by CA. The second 
type of data sets is generated by RC4. 


3.1 Visualization Results of Unified Model 


See Fig. 2. 


3.2 Visualization Results of Non-unified Model 


See Fig. 3. 


3.3 Visualization Results of CA with Different Length 
of Initial Sequence 


See Fig. 4. 


3.4 Visualization Results of RC4 Keystream with Different 
Segment Strategies 


See Fig. 5. 


4 Analysis of Results 


The above 27 3D maps contain different information. Some important conclusions 
will be discussed in detail in this section. 

The first group of results, shown in Fig. 2, presents two sets of six 3D maps constructed by the unified model from two data files, CA and RC4, to illustrate their 3D spatial characteristics. The three 3D maps in Fig. 2a1-a3 show the 3D spatial characteristics of CA with different logic functions; functions No. 23, 90, and 253 are selected as examples for comparison. The three 3D maps in Fig. 2b1-b3 show the 3D spatial characteristics of RC4 with 20 bits per segment and different given keys; keys 12, 88, and 155 are selected as examples for comparison. From a distribution viewpoint, different logic functions can be distinguished by their three-dimensional spatial characteristics in the CA files, e.g., (a1-a3). Unlike CA, for the RC4 keystream all spatial distributions always lie in a plane, e.g., (b1-b3).

Fig. 3 Two sets of six 3D maps based on the non-unified model in different conditions; a1-a3 for the file CA; b1-b3 for the file RC4. Panel labels: (a1) f = 23, (a3) f = 253, (b1) k = 12, (b3) k = 155

Fig. 4 Three sets of six 3D maps under different conditions; a1-a2 for the logic function f = 15 and non-unified model; b1-b2 for the logic function f = 100 and non-unified model; c1-c2 for the logic function f = 170 and non-unified model. Rows: n = 12 (a1, b1, c1), n = 13 (a2, b2, c2)

The second group of results, shown in Fig. 3, presents two sets of six 3D maps constructed by the non-unified model. It is interesting to observe that all maps, whether from CA data files or RC4 keystream data files, have a planar distribution, e.g., (a1-a3) and (b1-b3).

The third group of results, shown in Fig. 4, presents three sets of six 3D maps constructed by the non-unified model from CA data files with different lengths of the initial sequence and given logic functions. Figure 4a1-a2 shows 3D maps for function No. 15, (b1-b2) shows 3D maps for function No. 100, and (c1-c2) shows 3D maps for function No. 170. The overall relationship of multiple-variable logic functions for spatial characteristics can be shown clearly. For example, under the non-unified model, whatever the logic function, all spatial distributions always lie in a plane, e.g., (a1-a2), (b1-b2), and (c1-c2). Different lengths of the initial sequence (n = 12, 13) give different spatial distributions for the same logic function, e.g., (a1-a2), (b1-b2), and (c1-c2).

Fig. 5 Three sets of nine 3D maps under different conditions; a1-a3 for the key = 90 and unified model; b1-b3 for the key = 90 and non-unified model; c1-c3 for the key = 123 and non-unified model. Rows: W = 20 (a1, b1, c1), W = 128 (a2, b2, c2), W = 256 (a3, b3, c3)

The fourth group of results, shown in Fig. 5, presents three sets of nine 3D maps for different conditions, including segment strategies and keys. In this group, three segment strategies (W = 20, 128, 256) are compared. Each set uses the same key, e.g., (a1-a3), (b1-b3), and (c1-c3), so that the strategies can be compared conveniently. The dispersion of points increases as the bit length of each segment decreases: the spatial distribution of points with 256 bits per segment is clearly more concentrated than the distribution with 20 bits, as shown in (a1-a2), (b1-b2), and (c1-c2). The 3D maps show some commonalities in the spatial distribution across different keys and segment strategies. First, under this construction, different keys can be distinguished by their three-dimensional spatial characteristics in the model, e.g., (b1-c1), (b2-c2), and (b3-c3). Second, whatever the key or segment strategy, all spatial distributions always lie in a plane. Third, the distribution features vary from key to key and from segment strategy to segment strategy.


5 Conclusions 


Both the similarities and the differences among these maps may indicate a comparable mechanism for expressing keystreams with different given keys, and high-level relationships that apply to the stream cipher mechanism. The spatial properties of a random sequence can be detected from the distribution of point clusters in the 3D maps discussed above. Different spatial distributions are illustrated on each phase space for the relevant logic function or keystream. For example, whatever the key or segment strategy, all spatial distributions always lie in a plane, and all maps, whether from CA data files or RC4 keystream data files, have a planar distribution under the non-unified model. Spatial distribution properties like these provide useful information for further exploring the RC4 stream cipher. This construction could provide insights into spatial information on stream cipher construction via 3D maps. Further exploration of this scheme is required.
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Part VI 
Applications—Quantum Simulations 


The best way to understanding is a few good examples. 


—Isaac Newton 


The true logic of this world is in the calculus of probabilities. 
—James Clerk Maxwell 


A deep truth is a truth so deep that not only is it true but its exact
opposite is also true.


—Niels Bohr 


In the direction of quantum information, several papers were published in the period 2011-2013, for example "Variant simulation system using quaternion structures", Journal of Modern Optics 59(5):484-492, 2012, and the chapter "Interactive Maps on Variant Phase Spaces" in Emerging Applications of Cellular Automata, https://doi.org/10.5772/51635, InTech Press, 2013. The variant scheme has also been cited in connection with the Afshar experiment, https://en.wikipedia.org/wiki/Afshar_experiment.

This part of quantum simulation is composed of two chapters (16 and 17). 

Chapter “Synchronous Property—Key Fact on Quantum Interferences” describes 
synchronous property in quantum interferences simulation on double path 
experiment. 

Chapter “The nth Root of NOT Operators of Quantum Computers” proposes a 
typical operator on the nth root of NOT operators as an algebraic solution. 


Synchronous Property—Key Fact
on Quantum Interferences


Particle Simulation on Double Path Experiment 


Jeffrey Zheng 


Abstract The double-slit experiment plays a key role in quantum theory in distinguishing particle and wave interactions, according to Feynman's claims. In this chapter, a double path model and the variant logic principle are applied to establish a simulation system for exhaustive testing targets. Using Einstein quanta interaction, different measure quaternion structures are investigated. Under Symmetry/Anti-symmetry and Synchronous/Asynchronous interaction conditions, eight groups of statistical results are generated as eight histograms to show their distributions. From this set of simulation results, it can be recognized that the synchronous condition is the key fact in generating quantum wave interference patterns and that the asynchronous condition is the key fact in producing classical particle distributions. Sample results are illustrated and explanations are discussed.


Keywords Double path · Interaction · Probability · Statistics · Simulation


1 Introduction 


Feynman explored quantum measurement puzzles deeply [1, 2] and emphasized: "The entire mystery of quantum mechanics is in the double-slit experiment." This experiment directly illustrates both classical and quantum interactive results. Under single- and double-slit conditions, dual visual distributions appear as particle and wave statistical distributions linked to von Neumann's measure theory [3].

From the 1970s, following the approach pioneered by CHSH [4], Aspect used experiments to test Bell inequalities [5-7]. After 40 years of development, many accurate experiments [8-10] have been performed successfully worldwide using laser, NMRI, large-molecule, quantum coding, and quantum communication approaches [5-8, 11-26].

In this chapter, a double path model is established using the Mach-Zehnder interferometer. Different approaches to quantum measures (Einstein, CHSH, and Aspect) are investigated through quaternion structures. Under multiple-variable logic functions and the variant principle, logic functions can be transferred into variant logic expressions as variant measures. Under such conditions, a variant simulation model is proposed. A given logic function f can be represented as two meta-logic functions $f_+$ and $f_-$ to simulate single and double path conditions. N-bit input vectors are exhausted over $2^N$ states to produce measured data, and the recursive data are organized into eight histograms. Results are determined by the symmetry/anti-symmetry properties evident in these histograms. Both kinds of results are obtained consistently from this model under synchronous/asynchronous conditions. Based on this set of simulation results, the synchronous condition shows a significant relationship to interference properties.


2 Double Path Model and Their Measures 


2.1 Mach-Zehnder Interferometer Model 


The Mach-Zehnder interferometer is the most popular device [6, 20] to support Young's double-slit experiment.

In Fig. 1a, a double path interferometer is shown. An input signal X under control function f causes the laser LS to emit the output signal $\rho$; the BP (bi-polarized filter) operation then outputs a pair of signals, $\rho_+$ and $\rho_-$. Both signals are processed by SW to output $\rho'_+$ and $\rho'_-$, and then by IM to generate the output signals $IM(\rho'_+, \rho'_-)$. In Fig. 1b, a representation model is described using the same signals.


Fig. 1 Double path model: a Mach-Zehnder double path model, b Description model. Legend: X pulsed input; f control function; LS light source; BP bi-prism; SW switcher; IM interactive measure; $\rho$ photon flow; $\rho_+, \rho_-$ polarized flow; $\rho'_+, \rho'_-$ switched flow; $IM(\rho'_+, \rho'_-)$ results
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2.2 Emission and Absorption Measures of Quantum 
Interaction 


Einstein established a model to describe atomic interaction with radiation in 1916 [27-30]. For two-state systems, let a system have two energy states: the ground state $E_1$ and the excited state $E_2$. Let $N_1$ and $N_2$ be the average numbers of atoms in the ground and excited states, respectively. The number of excited states changes through emission from $E_2$ to $E_1$ at a certain rate; at the same time, the number of ground states is determined by the energy absorbed from $E_1$ to $E_2$ at a corresponding rate. Let $N_{12}$ be the number of atoms moving from $E_1$ to $E_2$ and $N_{21}$ the number moving from $E_2$ to $E_1$. In Einstein's model, a measurement quaternion is $(N_1, N_2, N_{12}, N_{21})$.

CHSH proposed spin measures for testing Bell inequalities [4, 6]. They applied $\perp \to +$ and $\parallel \to -$ to establish a measurement quaternion

$$
(N_{++}(a, b),\ N_{+-}(a, b),\ N_{-+}(a, b),\ N_{--}(a, b)).
$$

Experimental testing of Bell inequalities was performed by Aspect [5] in 1982. Four parameters are measured: the transmission rate $N_t$, the reflection rate $N_r$, the correspondent rate $N_c$, and the total number $N_\omega$ in the time period $\omega$. This set of measures is a quaternion $(N_t, N_r, N_c, N_\omega)$. Among these, $N_c$ is a new data type not present in the Einstein and CHSH methods; this parameter could be an extension of synchronous/asynchronous time measurement.


3 Simulation Systems 


3.1 Simulation Model 


Using the variant principle described in the next subsections, for an N-bit 0-1 vector X and a given logic function f, all N-bit vectors are exhausted and the variant measures generate two groups of histograms. The variant simulation system is composed of three components, Pre-process, Interaction, and Post-process, as shown in Fig. 2.

In Fig. 2a, the three components of the variant simulation model are presented. In the pre-process, an N-bit 0-1 vector X and a function f are fed in to output a signal $\rho$. After the interaction component, two groups of signals are output: u for the symmetry group and v for the anti-symmetry group. In the post-process, all N-bit vectors are passed through the pre-process and interaction components until the whole data set of $2^N$ vectors has been processed, transforming the symmetry and anti-symmetry signals into eight histograms: four for symmetry distributions and four for anti-symmetry distributions.

Fig. 2 Variant simulation system; a Variant simulation system (Pre-process, Interaction, Post-process); b Interaction component


In Fig. 2b, only the interaction component is shown. The input signal $\rho$ is processed by BP to generate two signals $\{\rho_-, \rho_+\}$. SW outputs the triple of signals $\{\rho_-, 1 - \rho_-, \rho_+\}$, which passes through IM to generate the two groups of signals u and v.


3.2 Variant Principle 


The variant principle is based on n-variable logic functions [31-33]. For any n variables, $x = x_{n-1} \ldots x_i \ldots x_0$, $0 \le i < n$, $x_i \in \{0, 1\} = B_2$. Let position j be the selected bit, $0 \le j < n$, and $x_j \in B_2$ be the selected variable. Let y be the output variable and f an n-variable function, $y = f(x)$, $y \in B_2$, $x \in B_2^n$. For all states of x, the set $S(n)$ composed of the $2^n$ states can be divided into two sets, $S_0^j(n)$ and $S_1^j(n)$:

$$
\begin{aligned}
S_0^j(n) &= \{x \mid x_j = 0,\ \forall x \in B_2^n\} \\
S_1^j(n) &= \{x \mid x_j = 1,\ \forall x \in B_2^n\} \\
S(n) &= \{S_0^j(n),\ S_1^j(n)\}
\end{aligned}
$$

For a given logic function f, the input and output pair relationships define four meta-logic functions $\{f_\perp, f_+, f_-, f_\top\}$:

$$
\begin{aligned}
f_\perp(x) &= \{f(x) \mid x \in S_0^j(n),\ y = 0\} \\
f_+(x) &= \{f(x) \mid x \in S_0^j(n),\ y = 1\} \\
f_-(x) &= \{f(x) \mid x \in S_1^j(n),\ y = 0\} \\
f_\top(x) &= \{f(x) \mid x \in S_1^j(n),\ y = 1\}
\end{aligned}
$$

There are two canonical logic expressions: the AND-OR form is selected from $\{f_+(x), f_\top(x)\}$ as the y = 1 items, and the OR-AND form is selected from $\{f_-(x), f_\perp(x)\}$ as the y = 0 items. The items of $\{f_\top(x), f_\perp(x)\}$ satisfy $x_j = y$ and are therefore invariant.

The items of $\{f_+(x), f_-(x)\}$ satisfy $x_j \neq y$ and form the variant logic expression. Let $f(x) = \langle f_+ | x | f_- \rangle$ be a variant logic expression. Any logic function can be expressed in variant logic form. In the $\langle f_+ | x | f_- \rangle$ structure, $f_+$ selects the 1 items in $S_0^j(n)$, as in the AND-OR standard expression, and $f_-$ selects the 0 items in $S_1^j(n)$, as in the OR-AND expression. For a convenient understanding of the variant representation, the two-variable logic structures of all 16 functions are shown in Table 1.

For example, checking two functions f = 3 and f = 12:

$$
\{f = 3 := \langle 0 | 3 \rangle,\ f_+ = 11 := \langle 0 | \emptyset \rangle,\ f_- = 2 := \langle \emptyset | 3 \rangle\}
$$
$$
\{f = 12 := \langle 2 | 1 \rangle,\ f_+ = 14 := \langle 2 | \emptyset \rangle,\ f_- = 8 := \langle \emptyset | 1 \rangle\}
$$

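
The two examples above can be checked with a small sketch. It assumes that the function number f encodes the truth table with bit s of f equal to f(state s), which is consistent with the f = 3 and f = 12 examples. The function names are hypothetical.

```python
from typing import List, Tuple


def variant_expression(f: int, n: int = 2, j: int = 0) -> Tuple[List[int], List[int]]:
    """Return the state lists (plus, minus) of the variant form <f_+|x|f_->.

    plus:  states with x_j = 0 and y = 1 (0 -> 1 items).
    minus: states with x_j = 1 and y = 0 (1 -> 0 items).
    """
    plus, minus = [], []
    for state in range(2 ** n - 1, -1, -1):   # high to low, as written in the text
        y = (f >> state) & 1
        xj = (state >> j) & 1
        if xj == 0 and y == 1:
            plus.append(state)
        elif xj == 1 and y == 0:
            minus.append(state)
    return plus, minus


def meta_functions(f: int, n: int = 2, j: int = 0) -> Tuple[int, int]:
    """Return the function numbers (f_plus, f_minus) derived from f."""
    mask1 = sum(1 << s for s in range(2 ** n) if (s >> j) & 1)   # states with x_j = 1
    mask0 = (2 ** (2 ** n) - 1) ^ mask1                          # states with x_j = 0
    f_plus = (f & mask0) | mask1   # keep the 0 -> 1 items, make x_j = 1 states invariant
    f_minus = f & mask1            # keep the 1 -> 0 items, make x_j = 0 states invariant
    return f_plus, f_minus
```

For instance, `meta_functions(3)` gives `(11, 2)` and `meta_functions(12)` gives `(14, 8)`, matching the examples.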
3.3 Variant Measures 


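Let $\Delta$ be the variant measure function [23, 33].

$$
\Delta = (\Delta_\perp, \Delta_+, \Delta_-, \Delta_\top)
$$

$$
\Delta f(x) = (\Delta_\perp f(x), \Delta_+ f(x), \Delta_- f(x), \Delta_\top f(x)) = (\Delta f_\perp(x), \Delta f_+(x), \Delta f_-(x), \Delta f_\top(x))
$$

$$
\Delta f_\alpha(x) =
\begin{cases}
1, & \text{if } f(x) \in f_\alpha(x),\ \alpha \in \{\perp, +, -, \top\} \\
0, & \text{otherwise}
\end{cases}
$$

For any given n-variable state, exactly one position in $\Delta f(x)$ is 1 and the other three positions are 0.

For any N-bit 0-1 vector $X = X_{N-1} \ldots X_J \ldots X_0$, $0 \le J < N$, $X_J \in B_2$, $X \in B_2^N$, under an n-variable function f, the N-bit 0-1 output vector is $Y = f(X) = \langle f_+ | X | f_- \rangle$, $Y = Y_{N-1} \ldots Y_J \ldots Y_0$, $0 \le J < N$, $Y_J \in B_2$, $Y \in B_2^N$.

For the Jth position, let $x^J = [\ldots X_J \ldots] \in B_2^n$ form $Y_J = f(x^J) = \langle f_+ | x^J | f_- \rangle$, with the N bit positions cyclically linked. The variant measures of f(X) can be decomposed

$$
\Delta(X : Y) = \Delta f(X) = \sum_{J=0}^{N-1} \Delta f(x^J) = (N_\perp, N_+, N_-, N_\top)
$$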

as a quaternion $(N_\perp, N_+, N_-, N_\top)$.

For example, N = 10, given f, Y = f(X).

X = 0 1 1 0 0 1 0 1 1 0
Y = 1 0 1 0 1 0 1 0 1 0
$\Delta$ = $+\ -\ \top\ \perp\ +\ -\ +\ -\ \top\ \perp$

$\Delta f(X) = (N_\perp, N_+, N_-, N_\top) = (2, 3, 3, 2)$, N = 10

Input and output pairs are 0-1 variables taking one of the four combinations. For any given function f, the counts of $\{\perp, +, -, \top\}$ are determined directly from the input/output sequences.

Table 1 Two-variable logic functions and variant logic representation (n = 2, j = 0). The second group of state columns gives the variant form, in which the entries for states 11 and 01 are inverted.

f No.   f ∈ S(2)      11 10 01 00    f_+ ∈ S_0(2)    11 10 01 00    f_- ∈ S_1(2)
0       ∅             0  0  0  0     ⟨∅|             1  0  1  0     |3,1⟩
1       {0}           0  0  0  1     ⟨0|             1  0  1  1     |3,1⟩
2       {1}           0  0  1  0     ⟨∅|             1  0  0  0     |3⟩
3       {1,0}         0  0  1  1     ⟨0|             1  0  0  1     |3⟩
4       {2}           0  1  0  0     ⟨2|             1  1  1  0     |3,1⟩
5       {2,0}         0  1  0  1     ⟨2,0|           1  1  1  1     |3,1⟩
6       {2,1}         0  1  1  0     ⟨2|             1  1  0  0     |3⟩
7       {2,1,0}       0  1  1  1     ⟨2,0|           1  1  0  1     |3⟩
8       {3}           1  0  0  0     ⟨∅|             0  0  1  0     |1⟩
9       {3,0}         1  0  0  1     ⟨0|             0  0  1  1     |1⟩
10      {3,1}         1  0  1  0     ⟨∅|             0  0  0  0     |∅⟩
11      {3,1,0}       1  0  1  1     ⟨0|             0  0  0  1     |∅⟩
12      {3,2}         1  1  0  0     ⟨2|             0  1  1  0     |1⟩
13      {3,2,0}       1  1  0  1     ⟨2,0|           0  1  1  1     |1⟩
14      {3,2,1}       1  1  1  0     ⟨2|             0  1  0  0     |∅⟩
15      {3,2,1,0}     1  1  1  1     ⟨2,0|           0  1  0  1     |∅⟩
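
The exhaustive measurement over all N-bit vectors can be sketched as follows for n = 2, j = 0. The neighbourhood used to form each 2-bit state, the selected bit $X_J$ together with its cyclic neighbour $X_{(J+1) \bmod N}$, is an assumption made for illustration; the text only states that the positions are cyclically linked. All function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Quaternion = Tuple[int, int, int, int]   # (N_bot, N_plus, N_minus, N_top)


def apply_function(f: int, X: List[int]) -> List[int]:
    """Apply a 2-variable function number f to every position of X.

    Assumption: the state at position J is 2 * X[(J + 1) % N] + X[J],
    with X[J] as the selected bit (j = 0).
    """
    N = len(X)
    return [(f >> (2 * X[(J + 1) % N] + X[J])) & 1 for J in range(N)]


def quaternion(X: List[int], Y: List[int]) -> Quaternion:
    """Count the 0->0, 0->1, 1->0 and 1->1 transformations between X and Y."""
    counts = [0, 0, 0, 0]
    for a, b in zip(X, Y):
        counts[2 * a + b] += 1
    return (counts[0], counts[1], counts[2], counts[3])


def exhaustive_quaternions(f: int, N: int) -> Dict[Tuple[int, ...], Quaternion]:
    """Measure all 2**N input vectors under function f."""
    return {
        X: quaternion(list(X), apply_function(f, list(X)))
        for X in product((0, 1), repeat=N)
    }


results = exhaustive_quaternions(f=3, N=4)
```

Two sanity checks do not depend on the neighbourhood assumption: f = 10 copies the selected bit, so only invariant counts appear, and f = 5 negates it, so only variant counts appear.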


3.4 Measurement Equations 


Using the variant quaternion, signals are calculated by the following equations. For any N-bit 0-1 vector X and function f under the $\Delta$ measurement, $\Delta f(X) = (N_\perp, N_+, N_-, N_\top)$, $N = N_\perp + N_+ + N_- + N_\top$. The signal $\rho$ is defined by

$$
\rho = \frac{\Delta f(X)}{N} = (\rho_\perp, \rho_+, \rho_-, \rho_\top), \qquad \rho_\alpha = \frac{N_\alpha}{N},\ 0 \le \rho_\alpha \le 1,\ \alpha \in \{\perp, +, -, \top\}
$$

Using $\{\rho_+, \rho_-\}$, a pair of signals {u, v} is formulated:

$$
u = (u_0, u_+, u_-, u_1) = \{u_\beta\}, \qquad v = (v_0, v_+, v_-, v_1) = \{v_\beta\}, \qquad \beta \in \{0, +, -, 1\}
$$

$$
\begin{aligned}
u_0 &= \rho_- \oplus \rho_+ & v_0 &= (1 - \rho_-)/2 \oplus (1 + \rho_+)/2 \\
u_+ &= \rho_+ & v_+ &= (1 + \rho_+)/2 \\
u_- &= \rho_- & v_- &= (1 - \rho_-)/2 \\
u_1 &= \rho_- + \rho_+ & v_1 &= (1 - \rho_- + \rho_+)/2
\end{aligned}
$$

where $0 \le u_\beta, v_\beta \le 1$, $\beta \in \{0, +, -, 1\}$, $\oplus$ denotes asynchronous addition, and + denotes synchronous addition.

Using the {u, v} signals, each $u_\beta$ ($v_\beta$) determines a fixed position in the relevant histogram, at which vector X is recorded. After all $2^N$ data sequences have been processed, eight symmetry/anti-symmetry histograms $\{H(u_\beta | f)\}$ ($\{H(v_\beta | f)\}$), $\beta \in \{0, +, -, 1\}$, are generated.
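
The histogram construction can be sketched as below. Asynchronous addition is interpreted here as recording the two values separately in the same histogram, while synchronous addition combines the values first and records the sum once. This reading is an assumption based on the text's statement that the double-particle histogram equals the sum of the two single-path histograms. The sample quaternions are made up.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, Iterable, Tuple

Quaternion = Tuple[int, int, int, int]   # (N_bot, N_plus, N_minus, N_top)


def histograms(quaternions: Iterable[Quaternion]) -> Dict[str, Counter]:
    """Accumulate the eight histograms H(u_beta), H(v_beta) from quaternions."""
    H = {k: Counter() for k in ("u0", "u+", "u-", "u1", "v0", "v+", "v-", "v1")}
    half = Fraction(1, 2)
    for n_bot, n_plus, n_minus, n_top in quaternions:
        N = n_bot + n_plus + n_minus + n_top
        rho_plus, rho_minus = Fraction(n_plus, N), Fraction(n_minus, N)
        u_plus, u_minus = rho_plus, rho_minus
        v_plus, v_minus = (1 + rho_plus) * half, (1 - rho_minus) * half
        H["u+"][u_plus] += 1
        H["u-"][u_minus] += 1
        H["v+"][v_plus] += 1
        H["v-"][v_minus] += 1
        # Asynchronous addition (assumed): both values are recorded separately.
        H["u0"][u_minus] += 1
        H["u0"][u_plus] += 1
        H["v0"][v_minus] += 1
        H["v0"][v_plus] += 1
        # Synchronous addition: the values are combined before recording.
        H["u1"][rho_minus + rho_plus] += 1
        H["v1"][(1 - rho_minus + rho_plus) * half] += 1
    return H


sample = [(2, 3, 3, 2), (5, 0, 0, 5), (1, 4, 4, 1)]   # made-up quaternions, N = 10
H = histograms(sample)
```

Exact fractions are used as histogram positions so that equal signal values always fall into the same bin.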


4 Simulation Results 


The simulation provides a series of output results. In this section, two cases are selected: N = {12, 13}, n = 2, j = 0, with $\{f = 3, f_+ = 11, f_- = 2\}$ and $\{f = 12, f_+ = 14, f_- = 8\}$, corresponding to the double path, right path, left path, symmetric, and nonsymmetric conditions, respectively. For convenience of comparison, sample cases are shown in Fig. 3a-c. Figure 3a illustrates the representation patterns; Fig. 3b represents the f = 3 conditions and Fig. 3c the f = 12 conditions. Eight histograms of $H(u_+ | f) = H(u_- | f)$ are shown, with results represented by symmetric meta-functions in four groups.


5 Analysis of Results 


5.1 Visual Distributions 


Under $H(u_+ | f) = H(u_- | f)$ conditions, $\{H(u_1 | f), H(v_1 | f)\}$ show significant interference patterns that differ from those under other conditions. Output results are balanced.


5.2 Particle Statistical Distributions 


For all symmetric or nonsymmetric cases under $\oplus$ asynchronous addition operations, relevant values meet $0 \le u_0, v_0, u_-, v_-, u_+, v_+ \le 1$.

Checking the $\{H(u_0 | f), H(v_0 | f)\}$ series, $\{H(u_+ | f), H(u_- | f)\}$ and $\{H(v_+ | f), H(v_- | f)\}$ satisfy the following equations:

$$
\begin{aligned}
H(u_0 | f) &= H(u_- | f) + H(u_+ | f) \\
H(v_0 | f) &= H(v_- | f) + H(v_+ | f)
\end{aligned}
$$

The equations hold even when N and n take different values.

Fig. 3 Results of symmetric meta distributions. (a) Statistical histogram patterns; (b) N = {12, 13}, f = 3, histograms of symmetric meta distributions; (c) N = {12, 13}, f = 12, histograms of symmetric meta distributions. Layout of panel (a), for N = 12 and N = 13:

Conditions                           Left Path      Right Path     Double-Particle   Double-Wave
Symmetric meta distributions         H(u_+ | f)     H(u_- | f)     H(u_0 | f)        H(u_1 | f)
Anti-symmetric meta distributions    H(v_+ | f)     H(v_- | f)     H(v_0 | f)        H(v_1 | f)



5.3. Wave Interference Patterns 


Different interference properties are observed clearly under $H(u_+ | f) = H(u_- | f)$ and $H(v_+ | f) = H(1 - v_- | f)$ conditions. Under + synchronous addition operations, relevant values meet $0 \le u_1, v_1, u_-, v_-, u_+, v_+ \le 1$.

In the $\{H(u_1 | f), H(v_1 | f)\}$ distributions, especially the $\{u_1, v_1\}$ cases in Fig. 3b-c, extremely strong interference appears; compared with $\{H(u_+ | f), H(u_- | f)\}$ and $\{H(v_+ | f), H(v_- | f)\}$, there are significant differences. Spectra in different cases illustrate wave interference properties. The listed histogram distributions all satisfy

$$
\begin{aligned}
H(u_1 | f) &\neq H(u_- | f) + H(u_+ | f) = H(u_0 | f) \\
H(v_1 | f) &\neq H(v_- | f) + H(v_+ | f) = H(v_0 | f)
\end{aligned}
$$

Single and double peaks appear in the interference patterns, as in classical double-slit distributions.


5.4 Quaternion Measures 


It is interesting to see the relationship between the variant quaternion and other measures.

In the variant quaternion, $\Delta f(X) = (N_\perp, N_+, N_-, N_\top)$, $N = N_\perp + N_+ + N_- + N_\top$.

Einstein's two-state system of interaction $(N_1, N_2, N_{12}, N_{21})$ allows the following equations to be established:

$$
\begin{aligned}
N_1 &= N_\perp + N_+ \\
N_2 &= N_- + N_\top \\
N_{12} &= N_+ \\
N_{21} &= N_- \\
N &= N_1 + N_2
\end{aligned}
$$

From these equations, the measured pair $\{N_{21}, N_{12}\}$ has a one-to-one correspondence with $\{N_-, N_+\}$.

Selecting $+ \to 1$, $- \to 0$, the CHSH $N_{\pm\pm}(a, b)$ measures meet

$$
\begin{aligned}
N_{++}(a, b) &\to N_\top \\
N_{+-}(a, b) &\to N_- \\
N_{-+}(a, b) &\to N_+ \\
N_{--}(a, b) &\to N_\perp
\end{aligned}
$$

$$
(N_{++}, N_{+-}, N_{-+}, N_{--}) \to (N_\top, N_-, N_+, N_\perp)
$$

Let $N = N_{++} + N_{+-} + N_{-+} + N_{--}$; the CHSH quaternion is then a permutation of the variant quaternion.

Aspect's quaternion $(N_t, N_r, N_c, N_\omega)$ has the following correspondences:

$$
\begin{aligned}
N_t &\to N_- \\
N_r &\to N_+ \\
N_\omega &\to N
\end{aligned}
$$

There is no parameter in the variant quaternion for the parameter $N_c$. It indicates the number of joined actions used to distinguish single and double paths, corresponding to $\{u_0, v_0\}$ and $\{u_1, v_1\}$ times. In an actual experiment, this parameter is significant. In a simulated system, the parameter is a control coefficient that separates the two types of measured paths $\{u_0, v_0\}$ and $\{u_1, v_1\}$ in comparisons with real experiments.


6 Conclusions 


By analyzing an N-bit 0-1 vector and its exhaustive sequences under variant measurement, this system simulates double path interference properties through different accurate distributions. Using this model, two groups of parameters $\{u_\beta\}$ and $\{v_\beta\}$ describe the left path, the right path, the double path for particles, and the double path for waves, with distinguished symmetry and anti-symmetry properties.

Only under synchronous conditions does the double path system provide wave-like interference patterns different from classical ones.

Comparing the variant quaternion with other quaternion structures helps in understanding the possible uses and limitations of variant simulation systems.

The n-variable function space has a size of $2^{2^n}$. The whole simulation complexity is $O(2^{2^n} \times 2^N)$, a product of super-exponential and exponential terms. How to overcome the limitations imposed by such complexity and how best to compare and contrast such simulations with real-world experimentation will be key issues in future work.


Acknowledgements Thanks to Mr. Colin W Campbell for editing the English, Mr. Jie Wan for generating the simulation data, and Mr. Qingping Li for producing the statistical histograms.


The nth Root of NOT Operators
of Quantum Computers


Jeffrey Zheng 


Abstract This chapter proposes a novel approach to resolve the nth root of NOT 
problem for quantum computers using (—1, 0, 1) permutation matrices. Only logic 
NOT and exchange operations are required. This result provides a complete solution 
to design and implement the nth root of NOT operators of quantum computers. 


Keywords Quantum simulator · Quantum computation · Square root of NOT · n-th root of NOT · Permutation matrix · Quantum logic gate


1 Introduction 


Feynman [1] first proposed a ‘universal quantum simulator’ as a step towards a true quantum computer. Since then, research and development in quantum computation and quantum computers have been the new frontier of next-generation computing for two decades [2, 3]. Classical quantum mechanics uses complex number vectors in Hilbert space to represent quantum states [4]. Any complex number is composed of two parts: a real part and an imaginary part. The imaginary number $i = \sqrt{-1}$ plays an essential role in the construction of quantum mechanics. However, the mystery of the imaginary number causes severe difficulties for its manipulation, imagination, and understanding [4-6]. Considering that modern computers are constructed on Boolean logic principles, how a traditional logic structure can be used to implement $\sqrt{-1}$ has been a puzzle deeply entangled in quantum computing for at least two decades [7-10]. Nothing in the published literature has described a way to implement this untamed operator using traditional logic operations [2, 11, 12].
1.1 The Square Root of NOT Problem


In traditional logic, negation corresponds to the logic NOT operator ($\neg$). Initiated by
Feynman [1] and further developed by Deutsch [9, 13], the problem of $\sqrt{\neg}$, ‘the
Square Root of NOT’, has been presented as one of the most difficult issues in
quantum computation, especially for general quantum gates. They suggested resolving
the equation $\sqrt{\neg} \cdot \sqrt{\neg} = \neg$ using logic operations. Maglicki and Wang
[11] provided an example of how to resolve the problem this way.


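Let the $\neg$ operation reverse two quantum spin states
$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$:

$$\neg|0\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |1\rangle$$

$$\neg|1\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = |0\rangle$$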


Using unitary rotational matrices, the $\sqrt{\neg}$ operator can be expressed as

$$\sqrt{\neg} = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{i\pi/4} & e^{-i\pi/4} \\ e^{-i\pi/4} & e^{i\pi/4} \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1+i & 1-i \\ 1-i & 1+i \end{pmatrix}$$

In these equations, both $e^{i\pi/4}$ and $i$ are involved. From a representational
viewpoint, the equations are of no help, because the symbols $i$ and $\sqrt{\neg}$ are
logically equivalent: the equations form a circular definition.
To explore how traditional logic can implement $\sqrt{\neg}$, it is necessary to
analyse what has been established at the foundation level of the modern complex
number construction.


1.2 Complex Number in History 


The origin and development of complex numbers has a long and mysterious history
[14-16]. In the eighteenth and nineteenth centuries, Euler and Gauss [15] made
foundational contributions by formally identifying imaginary parts as essential
components for solving nth-degree algebraic equations. After their work, the imaginary
number was gradually accepted by mainstream mathematicians as one of the
most important parts of mathematics [15]. Hamilton established consistent operations
on complex numbers in 1837 [17]. He constructed a complex number a + bi as
an ordered number pair (a, b).

For example, let a + bi and c + di be two complex numbers. The four essential
operations {+, −, ·, /} can be expressed as

$$(a, b) \pm (c, d) = (a \pm c,\ b \pm d)$$

$$(a, b) \cdot (c, d) = (ac - bd,\ ad + bc)$$

$$\frac{(a, b)}{(c, d)} = \left( \frac{ac + bd}{c^2 + d^2},\ \frac{bc - ad}{c^2 + d^2} \right)$$

Using ordered pair representation, complex number operations are firmly estab- 
lished on real number operations. No further mysterious characteristics of imaginary 


numbers remain in the equations because all operations are well defined in real 
number construction. 


2 Solution of the Square Root of NOT Problem 


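If we apply the imaginary unit to an ordered pair, we have

$$i : (a, b) \to (-b, a)$$

When the solution of $\sqrt{\neg}$ is not restricted to the {0, 1} field but the field is
extended to {−1, 0, 1}, a permutation matrix can be constructed.
Let

$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\quad I_2^{+} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},\quad I_2^{-} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},\quad Z_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},\quad Z_2^{T} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

$$Z_2 : (a, b) \to (-b, a)$$

$$(-b, a) = (a, b)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$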


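Because $Z_2$ gives the same result as the imaginary unit when applied to
the pair, its features deserve a detailed look.
The two eigenvalues of $Z_2$ can be determined from its characteristic determinant
$|\lambda I_2 - Z_2| = 0$:

$$\lambda^2 + 1 = 0, \quad \lambda^2 = -1, \quad \lambda = \pm\sqrt{-1}$$

This corresponds to either $\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$ or
$\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$ as the solution. There are two
unitary matrices $U_+$, $U_-$ and their two Hermitian conjugate matrices $U_+^*$, $U_-^*$
that perform similarity transformations on $Z_2$ to produce the two diagonal matrices:

$$U_+^* Z_2 U_+ = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad U_-^* Z_2 U_- = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$$

Although these three matrices belong to one matrix group under similarity
transformation, five matrices can be distinguished without any direct equality: $iI_2$,
the two diagonal matrices above, $Z_2$, and its transpose $Z_2^{T} = -Z_2$.

Applying any of the five matrices twice gives $-I_2$, for example

$$\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I_2$$

and

$$Z_2^2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I_2$$

Therefore, the $Z_2$ matrix is an equivalent form of the imaginary number under the
transformation.
For any ordered pair (a, b),

$$(a, b) \xrightarrow{Z_2} (-b, a) \xrightarrow{Z_2} (-a, -b)$$

$$(Z_2)^2 : (a, b) \to (-a, -b)$$

$$(Z_2)^2 = -I_2$$

$$Z_2 = \sqrt{-I_2}$$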


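So, the $\sqrt{\neg}$ operation can be constructed from one-to-one correspondences
with the $Z_2$ matrix.
Let $\langle x|$ be a quantum state, $\neg\langle x| = \langle \bar{x}|$. For the non-zero elements of $Z_2$, the two values
{−1, 1} of the elements map as

$$-1 : \langle x| \to \langle \bar{x}|, \qquad 1 : \langle x| \to \langle x|$$

so that a $\sqrt{\neg}$ operator is generated from the $Z_2$ operator.
For an ordered state pair $(\langle x|, \langle y|)$,

$$(\langle x|, \langle y|) \xrightarrow{\sqrt{\neg}} (\langle \bar{y}|, \langle x|) \xrightarrow{\sqrt{\neg}} (\langle \bar{x}|, \langle \bar{y}|) = \neg(\langle x|, \langle y|)$$

Therefore, $Z_2$ is a homologous form of the $\sqrt{\neg}$ operator.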

Under this construction, the square root of NOT problem in quantum computation 
is solved entirely. Only two elementary operations are involved in the transformation: 
logic NOT operation and pair—state exchange, respectively. They can be implemented 
readily using traditional logic constructions. 
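
The two elementary operations can be checked with a small sketch. The function names `apply_z2`, `sqrt_not` and `logic_not` are illustrative only and do not appear in the chapter; bits are modelled as the integers 0 and 1, and the exchange-plus-NOT rule follows the state-pair mapping given above.

```python
def apply_z2(pair):
    # Row vector (a, b) times Z_2 = [[0, 1], [-1, 0]] gives (-b, a).
    a, b = pair
    return (-b, a)


def sqrt_not(pair):
    # Logic-level counterpart on bits: exchange the pair, then apply NOT to
    # the entry that lands in the first position (the -1 entry of Z_2).
    x, y = pair
    return (1 - y, x)


def logic_not(pair):
    # Ordinary NOT applied to both bits of the pair.
    x, y = pair
    return (1 - x, 1 - y)


# (Z_2)^2 maps (a, b) to (-a, -b).
print(apply_z2(apply_z2((3, 5))))

# Applying the exchange-plus-NOT step twice equals NOT on the whole pair.
for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, sqrt_not(sqrt_not(bits)), logic_not(bits))
```

Only an exchange and a single NOT are used in `sqrt_not`, which is the point of the construction: no complex arithmetic is needed at the logic level.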


3 General Solution of the nth Root of NOT Operation 


In this part, a general solution of $\sqrt[n]{\neg}$, ‘the nth root of NOT’, for quantum
computers is explored.

Let $J_n$ denote a conjugate permutation matrix with n columns and n
rows in which each row (column) has exactly one non-zero element:

$$\sum_{i=1}^{n} |J_{i,j}| = \sum_{j=1}^{n} |J_{i,j}| = 1, \qquad J_{i,j} \in \{-1, 0, 1\},\ i, j \in [1, n]$$

Let $I_n$ be a unit matrix, $I_{i,j} = 1$ for $i = j$ and $I_{i,j} = 0$ for $i \neq j$, $i, j \in [1, n]$.

For example, the matrices

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad \begin{pmatrix} 0 & 0 & -1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix},\quad I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

are $J_3$ matrices.
Let $P_n$ be a (0, 1)-permutation matrix in which each column (row) contains only
one non-zero element, and let $PS(n)$ denote the permutation space containing all $P_n$ matrices.
Let $JS(n)$ denote the conjugate permutation space containing all $J_n$ matrices.


Lemma For a given n, $PS(n)$ contains a total of $n!$ distinguishable matrices,
that is, $|PS(n)| = n!$.


Theorem For a given n, $JS(n)$ contains a total of $2^n n!$ distinguishable
matrices, $|JS(n)| = 2^n n!$.


Proof Each non-zero element of $J_n$ takes one of two values {−1, 1}, so the n non-zero
elements allow $2^n$ sign selections. The n elements can occupy a total of $n!$ different
position arrangements. Sign and position selections are independent, and each combination
determines a $J_n$ matrix. So there are $2^n n!$ distinguishable matrices.


Corollary $JS(n)$ is a matrix space that is $2^n$ times larger than $PS(n)$.
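
The counting argument can be confirmed by brute force for small n. The sketch below is illustrative (the helper name `signed_permutation_matrices` is not from the chapter): it enumerates every choice of positions and signs and counts the distinct matrices.

```python
from itertools import permutations, product
from math import factorial


def signed_permutation_matrices(n):
    """Return the set of all n x n matrices with exactly one entry from
    {-1, 1} in each row and each column and zeros elsewhere."""
    matrices = set()
    for perm in permutations(range(n)):            # n! position choices
        for signs in product((1, -1), repeat=n):   # 2^n sign choices
            rows = tuple(
                tuple(signs[i] if j == perm[i] else 0 for j in range(n))
                for i in range(n)
            )
            matrices.add(rows)
    return matrices


for n in range(1, 5):
    count = len(signed_permutation_matrices(n))
    print(n, count, 2 ** n * factorial(n))
```

For each n the enumerated count and the closed form $2^n n!$ printed side by side should agree.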


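Theorem A matrix group of simple rotations in $JS(n)$ may contain $2n$ distinguishable
matrices.

Proof Use a rotation matrix $Z_n \in JS(n)$,

$$Z_n = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & \ddots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 \\ -1 & 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}, \quad J_{i,i+1} = 1,\ i \in [1, n-1],\ J_{n,1} = -1,$$

and a vector $X = (1\ 2\ 3\ \ldots\ n-1\ n)$.

Applying $Z_n$ to the vector $X$ sequentially $2n$ times produces the following $2n$ vectors:

$$\begin{array}{lcl}
X & = & (1,\ 2,\ 3,\ \ldots,\ n-2,\ n-1,\ n) \\
XZ_n & = & (-n,\ 1,\ 2,\ \ldots,\ n-3,\ n-2,\ n-1) \\
 & \vdots & \\
XZ_n^{n} & = & (-1,\ -2,\ -3,\ \ldots,\ -n+2,\ -n+1,\ -n) \\
XZ_n^{n+1} & = & (n,\ -1,\ -2,\ \ldots,\ -n+3,\ -n+2,\ -n+1) \\
 & \vdots & \\
XZ_n^{2n-1} & = & (2,\ 3,\ 4,\ \ldots,\ n-1,\ n,\ -1) \\
XZ_n^{2n} & = & (1,\ 2,\ 3,\ \ldots,\ n-2,\ n-1,\ n)
\end{array}$$

That is, $2n$ distinguishable matrices $\{Z_n^j\}_{j=1}^{2n}$, $Z_n^0 = Z_n^{2n} = I_n$, are included.
Because $X \xrightarrow{Z_n^n} -X \xrightarrow{Z_n^n} X$, we have $Z_n^n = -I_n$ and $Z_n^{2n} = I_n$, that is,
$(Z_n^n)^2 = I_n$.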
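
The rotation argument can be verified numerically for a concrete n. The following sketch uses plain nested lists, and the helper names (`z_matrix`, `mat_mul`, `powers`, `row_times`) are illustrative rather than taken from the chapter.

```python
def z_matrix(n):
    """Build Z_n: ones on the superdiagonal and -1 in the bottom-left corner."""
    z = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        z[i][i + 1] = 1       # J_{i,i+1} = 1
    z[n - 1][0] = -1          # J_{n,1} = -1
    return z


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def powers(n):
    """Return the list [Z_n^1, Z_n^2, ..., Z_n^(2n)]."""
    z = z_matrix(n)
    result, current = [], z
    for _ in range(2 * n):
        result.append(current)
        current = mat_mul(current, z)
    return result


def row_times(x, m):
    """Row vector x times matrix m."""
    n = len(x)
    return [sum(x[i] * m[i][j] for i in range(n)) for j in range(n)]


n = 5
p = powers(n)
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
minus_identity = [[-v for v in row] for row in identity]

print(row_times([1, 2, 3, 4, 5], p[0]))          # X Z_n
print(p[n - 1] == minus_identity)                # Z_n^n == -I_n
print(p[2 * n - 1] == identity)                  # Z_n^(2n) == I_n
print(len({tuple(map(tuple, m)) for m in p}))    # number of distinct powers
```

For n = 5 the first line shows the signed cyclic shift of X, and the last line counts the distinct powers of $Z_n$.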


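Theorem For a $Z_n$ there are n eigenvalues $\{\lambda_i\}_{i=1}^{n}$, $\lambda_i = \sqrt[n]{-1}$, $i \in [1, n]$.


Proof

$$|\lambda I_n - Z_n| = \begin{vmatrix} \lambda & -1 & 0 & \cdots & 0 \\ 0 & \lambda & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -1 \\ 1 & 0 & 0 & \cdots & \lambda \end{vmatrix} = \lambda^n + 1 = 0.$$

Therefore, $Z_n = \sqrt[n]{-I_n}$.
For non-zero values, the correspondences $1 : \langle x| \to \langle x|$ and
$-1 : \langle x| \to \langle \bar{x}|$ map $Z_n \to \sqrt[n]{\neg}$.

Theorem For any state vector $X$, $X(\sqrt[n]{\neg})^n = \neg X$.


Proof

$$\begin{array}{lcl}
X & = & (\langle 1|,\ \langle 2|,\ \langle 3|,\ \ldots,\ \langle n|) \\
X\sqrt[n]{\neg} & = & (\langle \bar{n}|,\ \langle 1|,\ \langle 2|,\ \ldots,\ \langle n-1|) \\
 & \vdots & \\
X(\sqrt[n]{\neg})^n & = & (\langle \bar{1}|,\ \langle \bar{2}|,\ \langle \bar{3}|,\ \ldots,\ \langle \bar{n}|) = \neg X
\end{array}$$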
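
At the bit level, the theorem says that a cyclic shift with one complemented wrap-around element, repeated n times, complements the whole vector. A minimal sketch, assuming states are modelled as lists of 0/1 values (the function name `nth_root_of_not` is illustrative):

```python
from itertools import product


def nth_root_of_not(state):
    """One step: (x_1, ..., x_n) -> (NOT x_n, x_1, ..., x_{n-1})."""
    return [1 - state[-1]] + state[:-1]


def apply_times(state, times):
    for _ in range(times):
        state = nth_root_of_not(state)
    return state


n = 4
for bits in product((0, 1), repeat=n):
    state = list(bits)
    negated = [1 - b for b in state]
    # n applications give NOT X; 2n applications restore X.
    print(state, apply_times(state, n) == negated,
          apply_times(state, 2 * n) == state)
```

Each step needs only one exchange cycle and one NOT, matching the claim that the operator is built from traditional logic operations.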


4 Conclusion 


Using (−1, 0, 1) permutation matrices as basic tools, the nth root of NOT operators
for quantum computers can be constructed and implemented with traditional logic
structures. Considering that this problem has puzzled advanced research on quantum
computers for 20 years, this solution may help quantum computer designers
implement quantum computers in practice using traditional logic. The details of this
construction will be investigated elsewhere, and the relationships among conjugate
logic, quantum logic, quantum gates and complex number structures will be explored
as a foundation for quantum computers and quantum computation in future computers.
Part VII
Applications—Binary Sequences 


Unity can only be manifested by the binary. 
Unity itself and the idea of Unity are already two. 
—Buddha 


Every axiomatic (abstract) theory admits, as is well known, an 
unlimited number of concrete interpretations besides those from 
which it was derived. 


Thus we find applications in fields of science which have no relation 
to the concepts of random event and of probability in the precise 
meaning of these words. 


—Andrey Kolmogorov 


At its most fundamental, information is a binary choice, in other 
words, 
a single bit of information is one yes-or-no choice. 


–James Gleick


Various approaches to variant construction on binary sequences have been developed
since 2011, starting with cellular automata data sequences used to construct 2D/3D maps.
Since 2014, different binary sequences generated from stream ciphers have been
extensively examined and combinatorial maps have been developed. Examples include Variant
Pseudo-Random Number Generator, Hakin9 Extra, Issue 6, 2012 (13), 28-31,
http://hakin9.org/hakin9-extra-62012/, and Interactive Maps on Variant Phase Spaces in
Emerging Application of Cellular Automata, InTech Press, 113-196, 2013,
http://dx.doi.org/10.5772/51635.

Further results were published, e.g., Cryptographic Sequence on Variant Maps,
ASONAM 2017: 1065-1071, https://doi.org/10.1145/3110025.3110152, and
Stationary Randomness of Quantum Cryptographic Sequences on Variant Maps, the
2017 IEEE/ACM International Conference, ASONAM 2017: 1041-1048,
https://doi.org/10.1145/3110025.3110151.

This part on binary sequences is composed of five chapters (18-22).
Chapter “Novel Pseudorandom Number Generation Using Variant Logic
Framework” proposes a novel PRNG that uses the variant logic framework to apply mixed
operations of permutation and complement in variant tables, generating random
sequences under various control parameters.

Chapter “RC4 Cryptographic Sequence on Variant Maps” maps binary sequences
of the RC4 stream cipher onto 1DP and 2DP variant maps. Different characteristics of
the visual distributions can be observed.

Chapter “Refined Stationary Randomness of Quantum Random Sequences on
Variant Maps” checks the stationary randomness of three quantum random sequences
{ANU, USTC, USTCo}; significant measuring differences are identified.

Chapter “Using Information Entropy to Measure Stationary Randomness of
Quantum Random Sequences” uses information entropy to measure the stationary
randomness of quantum random sequences. Data streams from USTC are selected
and their quantitative measurements are compared.

Chapter “Visual Maps of Variant Combinations on Random Sequences” proposes
visual maps of variant combinations on random sequences that provide a
flexible framework to support various projections under complicated combinations.
Typical maps are illustrated.


Novel Pseudorandom Number
Generation Using Variant Logic
Framework


Jeffrey Zheng 


Abstract Cybersecurity requires cryptology for basic protection. Among the different
ECRYPT technologies, stream ciphers play a central role in advanced network
security applications, and pseudorandom number generators sit at the core of the
mechanism. In this chapter, a novel method of pseudorandom number generation is
proposed that takes advantage of the large functional space described by variant
logic, a new framework for binary logic. Permutation and complementary operations
on a classical truth table form a variant table, from whose entries numbers with
pseudorandom properties can be selected. A simple generation mechanism is described
and shown, and the pseudorandom sequences are analyzed for their cycle property and
complexity. This method may play a useful role in future applications that require
higher performance in cybersecurity environments.


Keywords Pseudorandom number generation · Variant logic · Cryptology


1 Introduction 


In advanced cyber environments, cybersecurity mechanisms play a guiding role in
protecting the information communicated and stored in network facilities [1, 2]. To
achieve adequate network security, cryptology has to occupy an essential
position [1]. Unlike block ciphers, which apply a fixed transformation to
a large block of plaintext, stream ciphers apply a time-varying transformation
to individual plaintext digits. In the stream cipher methodology, the Pseudorandom
Number Generator (PRNG) is the central part of the mechanism.
From 2000 to 2003, the New European Schemes for Signatures, Integrity, and Encryption
(NESSIE) project was carried out [3]. During 2004-2008, another European stream cipher
project, eSTREAM, selected four software and three hardware schemes for ECRYPT
stream ciphers [4]. Such extensive international activities on ECRYPT methodologies
show the great importance of stream cipher technologies in cyber environments
for wider security applications.

From a cyber resilience viewpoint [5-7], a number of researchers focus on
leakage-resilient pseudorandom generators. This direction has shown interesting
results in protecting valuable information against side-channel attacks.

The PRNG plays a key role in stream cipher applications and is the heart of
cryptology [1, 8-10]. Many mathematical methodologies are applied to this field, such
as linear automata, cellular automata, Galois fields, and other algebraic constructions
[1, 9, 11-14]. In cryptology, Boolean logic operations are essential for creating highly
effective systems [1, 9, 15, 16], as binary logic achieves great
efficiency through the manipulation of only 1's and 0's. Therefore, it is advantageous to
investigate potential mechanisms in binary logic because of the follow-on effect they have in
cryptology.


2 Classical Logic Function Table 


A classic logic function in n variables can be represented as a truth table [8, 9]. For
a classic sequence in an ordinary number sequence, each table contains $2^n$ columns
and $2^{2^n}$ rows with a total of $2^n \cdot 2^{2^n}$ bits. An example of the standard
truth table can be seen in Fig. 1a.


[Figure 1 panels: (a) Truth Table Example, with columns indexed $2^n - 1, \ldots, i, \ldots, 0$ and rows $0, \ldots, J, \ldots, 2^{2^n} - 1$; (b) Variant Table Example, with columns $AP(2^n - 1), \ldots, AP(i), \ldots, AP(0)$ and row values $K_0, \ldots, K_J, \ldots, K_{2^{2^n}-1}$]


Fig. 1 n variable truth table and variant table under P and A operators
3 Variant Logic Function Table


Variant logic construction is a newly proposed theoretical structure [17, 18] that extends
classical logic beyond its three basic operators {∩, ∪, ¬}. Two additional vector
operators, permutation P and complementary A, are included with the original three to
form the five basic operators of the novel framework. Let $S(N)$ denote a permutation
group on N elements; then $S(N)$ contains a total of $N!$ permutation operators.
Let $B_2^N = \{0, 1\}^N$ denote a binary group with N elements; then $B_2^N$ contains a total
of $2^N$ complementary operators.

The permutation (P) and complementary (A) operators are two vector operators
performed on each column vector of $2^{2^n}$ bits. For a given P and A, the two operators
transform the truth table into a variant table. Permutation operators change the positions
of the relevant columns but do not change their values. Complementary operators (A) do
not change the position of any column, but may complement all values of a column.
The two operators can be applied together to generate a variant table for further
use. There are $2^n$ columns in the table as permutation elements, so the permutation
group $S(2^n)$ contains a total of $2^n!$ permutation operators, and the complementary
group $B_2^{2^n}$ includes a total of $2^{2^n}$ complementary operators. An example of the variant
table can be seen in Fig. 1b.


4 Variant Method of Pseudorandom Number Generation 


Input: n, P, A, m, L variables, $n \in N$, $P \in S(2^n)$, $A, L, m \in B_2^{2^n}$

Output: $\{K_m, K_{m+1}, \ldots, K_{m+L-1}\}$, $L \cdot 2^n$ bit sequences

Method: The process for pseudorandom number generation can be seen in Fig. 2.
n is the number of input variables. Using n variables, a standard truth table can be
constructed with $2^n$ columns and $2^{2^n}$ rows. P is a given permutation operator
$P = (P_{2^n-1} \ldots P_I \ldots P_0)$, $P \in S(2^n)$, where $P_I$ corresponds to the I-th column. A given
complementary operator $A \in B_2^{2^n}$, $A = (A_{2^n-1} \ldots A_I \ldots A_0)$, $A_I \in B_2$, indicates how
the operator is performed on the I-th column: if $A_I = 0$, all values of the column are
reversed, and if $A_I = 1$, all values are invariant. $0 \le m < 2^{2^n}$ is the initial position
of the output sequence; under the $K_m$, L conditions, $\{K_{m+j}\}_{j=0}^{L-1}$ are the generated
0-1 bit sequences.


5 Sequence Generation Example 


To make the procedure easier to follow, an example for the n = 2 case is shown in
Fig. 3. Parameters are initialized to arbitrary values: n = 2, P = (1203), and
A = (0110).
After the table is generated, the pseudorandom sequence can be read off the table.
For the m = 4 and L = 6 conditions, a random number sequence of six elements starting
at position 4 of the variant table is obtained: $(K_4, \ldots, K_9)$ = (13, 15, 5, 7, 8, 10).
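
The generation method can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `variant_table` and `generate` are hypothetical, P and A are given as tuples listed from the highest column position down to position 0 (as in P = (1203), A = (0110)), A is applied to the permuted column positions, and indices past the end of the table wrap around (the wrap-around is an assumption; the chapter does not state it).

```python
def variant_table(n, P, A):
    """Return the list of variant-table values K_0 ... K_(2^(2^n) - 1)."""
    cols = 2 ** n            # number of columns in the truth table
    rows = 2 ** cols         # number of rows in the truth table
    table = []
    for j in range(rows):    # row j of the truth table is the binary form of j
        value = 0
        for pos in range(cols):
            src = P[cols - 1 - pos]          # original column moved to position pos
            bit = (j >> src) & 1
            if A[cols - 1 - pos] == 0:       # A_I = 0: column is complemented
                bit = 1 - bit
            value |= bit << pos
        table.append(value)
    return table


def generate(n, P, A, m, L):
    """Read L entries starting at position m, each as a 2^n-bit string."""
    table = variant_table(n, P, A)
    cols = 2 ** n
    return [format(table[(m + j) % len(table)], f"0{cols}b") for j in range(L)]


P = (1, 2, 0, 3)
A = (0, 1, 1, 0)
print(variant_table(2, P, A))
print(generate(2, P, A, m=4, L=6))
```

With the parameters of the example, the first printed list corresponds to the K column of the variant table in Fig. 3, and the second gives the six 4-bit outputs starting at position 4.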


Fig. 2 Variant method of random number generation


Row   Truth Table   Permutation Table        Variant Table
 j    (3 2 1 0)     P = (1203)     value     A = (0110)     K_j
 0    0 0 0 0       0 0 0 0          0       1 0 0 1          9
 1    0 0 0 1       0 0 1 0          2       1 0 1 1         11
 2    0 0 1 0       1 0 0 0          8       0 0 0 1          1
 3    0 0 1 1       1 0 1 0         10       0 0 1 1          3
 4    0 1 0 0       0 1 0 0          4       1 1 0 1         13
 5    0 1 0 1       0 1 1 0          6       1 1 1 1         15
 6    0 1 1 0       1 1 0 0         12       0 1 0 1          5
 7    0 1 1 1       1 1 1 0         14       0 1 1 1          7
 8    1 0 0 0       0 0 0 1          1       1 0 0 0          8
 9    1 0 0 1       0 0 1 1          3       1 0 1 0         10
10    1 0 1 0       1 0 0 1          9       0 0 0 0          0
11    1 0 1 1       1 0 1 1         11       0 0 1 0          2
12    1 1 0 0       0 1 0 1          5       1 1 0 0         12
13    1 1 0 1       0 1 1 1          7       1 1 1 0         14
14    1 1 1 0       1 1 0 1         13       0 1 0 0          4
15    1 1 1 1       1 1 1 1         15       0 1 1 0          6


Fig. 3 Example for generation of pseudorandom sequence




6 Complexity Analysis 


From an application viewpoint, it is important to have an exact complexity evaluation
for the method. In the initial stage, it is necessary to manipulate $2^n$ columns, each
with $2^{2^n}$ rows; a total of $2^n \cdot 2^{2^n}$ bits is required. The total
complexity is of order $O(2^n \cdot 2^{2^n})$.

To generate the variant table values, P operations need to manipulate each bit at least once
and A operations manipulate the same number of bits, i.e., $O(2^n \cdot 2^{2^n})$.

Selecting $L \cdot 2^n$ bits from the variant table requires $O(L \cdot 2^n)$
operations.

If a full table needs to be generated as a random resource, $O(2^n \cdot 2^{2^n})$ computational
complexity is required. In general, the computational complexity ranges from $O(L \cdot 2^n)$
to $O(2^n \cdot 2^{2^n})$, $0 < L < 2^{2^n}$.

Maximal cycle length: under this construction, the maximal length of the
pseudorandom number sequence is $2^n \cdot 2^{2^n}$ bits. Any shorter output
sequence has a length less than this number. No clear cycle effects can be directly
observed.
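
To give a feeling for how quickly these quantities grow, the sketch below evaluates the table size $2^n \cdot 2^{2^n}$ and the number of (P, A) configurations $2^n! \cdot 2^{2^n}$ for small n. The function names are illustrative and not from the chapter.

```python
from math import factorial


def table_bits(n):
    """Bits in the full table: 2^n columns times 2^(2^n) rows."""
    return 2 ** n * 2 ** (2 ** n)


def configurations(n):
    """Number of (P, A) pairs: (2^n)! permutations times 2^(2^n) complements."""
    return factorial(2 ** n) * 2 ** (2 ** n)


for n in range(1, 6):
    print(n, table_bits(n), configurations(n))
```

For n = 2 the table holds 64 bits and there are 384 configurations; by n = 5 the full table already needs $2^{37}$ bits, which is why reading only $L \cdot 2^n$ bits is the practical mode of use.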


7 Conclusion 


This new PRNG method is designed on variant logic construction because
P and A potentially provide a huge configuration space, $2^n! \times 2^{2^n}$ times larger
than classical logic function spaces. Exploring how difficult this mechanism is
to decode will be a main theoretical target for cryptologists. In
addition, it is important to understand what type of distribution is relevant to this
generation mechanism. Owing to the intrinsic complexity of variant logic construction,
it provides potential barriers that protect this type of sequence from being decoded directly.

Considering that the PRNG is the central part of the stream cipher mechanism, and that
stream cipher technologies are more and more important in advanced network security
environments, higher performance methodologies and relevant implementations will
be useful in this field. Ongoing work will focus on whether this mechanism
provides better PRNG methods that help protect against side-channel attacks
[1-7, 19, 20] in wider network applications and resolve practical leakage-resilience
issues in the future.
RC4 Cryptographic Sequence on Variant
Maps


Zhonghao Yang and Jeffrey Zheng 


Abstract In the modern cyberspace environment, big data streams are a central
issue in people's daily lives: each person produces a large number of
data streams every day from personal computers, cell phones, and various wearable
smart devices. Security risks in the storage and transmission of data streams may lead
to the disclosure of personal privacy, so network security needs useful
tools to face these challenges. Randomness testing provides useful tools for securing the
results of stream ciphers. Based on multiple statistical probability distributions, this chapter
presents a visual scheme, variant maps, that measures a whole cryptographic sequence
as multiple 1D and 2D maps. The mapping mechanism and sample cases are provided.


Keywords Random sequence · Big data · Variant map


1 Introduction 


In modern cyberspace environments, more than 2.5 EB of data streams are
generated per day from global network environments [1]. Large network companies manage
massive data streams measured in PB every day [2]. Developments in artificial intelligence
make it easier to extract valuable information from big data [3-5]. Big data
and big data technology bring modern societies much convenience in many
areas, along with several threats to network security [6, 7].

Stream ciphers are among the most useful schemes for protecting data streams
in both transmission and storage. Pseudorandom number sequences are
generated by various algorithms based on recursive computational models, and true
random number sequences are generated by different physical methods. Typical
stream ciphers are RC4 and Salsa20. Stream ciphers can also be built from block ciphers
in OFB or CTR mode. In this chapter, an RC4 stream cipher is selected to generate
pseudorandom sequences for testing.

From a testing viewpoint, randomness tests focus on three aspects: probability,
autocorrelation, and unpredictability. NIST 800-22 provides a list of randomness
testing methods based on p-values [8].

In this chapter, two types of statistical probability maps, 1D and 2D, are used to
visualize a long pseudorandom number sequence generated from an RC4 stream
cipher.


2 Related Work 


The variant map is an emerging technology proposed in the 2010s to handle multiple 0-1
vectors in phase spaces within the variant framework [9-11]. Different applications of
variant maps have been explored on ECG data sequences [12], bat echolocation call
sequences [13], gene sequences [14], and cryptographic sequences [15-17].


3 Mapping Model 


This chapter uses two mapping schemes on 1D and 2D statistical probability distribu- 
tions as variant maps for an input N-length 0-1 sequence. The architectural diagram 
of the mapping model is shown in Fig. 1. It is composed of three components: seg- 
mentation, measurement, and visualization. 


input: 0-1 sequence → Segmentation → Measurement → Visualization → output: variant maps


Fig. 1 Architecture of variant map for cryptographic sequence




Fig. 2. Measurement = pi 


3.1 Basic Symbol 


(1) S: an input 0-1 sequence,
(2) $s_i$: the i-th segment of the input sequence,
(3) N: length of the input sequence,
(4) M: count of segments,
(5) m: length of a segment, and
(6) p: number of 1 elements in the segment.


3.2 Mapping Model 


Three components can be described as follows.

• Segmentation

The input data is a 0-1 sequence S of length N. It is divided into M segments, and
each segment has m elements.

$$S = \{s_0, s_1, \ldots, s_i, \ldots, s_{M-1}\}, \quad 0 \le i < M$$

• Measurement

For each segment $s_i$ of S, one feature $p_i$ of the segment is obtained, namely
the number of 1s in $s_i$, with $0 \le p_i \le m$. For
example, for two segments $s_1 = 00011$ and $s_2 = 10110$, the two measurements
are $p_1 = 2$ and $p_2 = 3$ (Fig. 2).

Calculating all segments of S determines a set of p measurements.

$$\{p_0, p_1, \ldots, p_i, \ldots, p_{M-1}\} = \{p_i\}_{i=0}^{M-1}, \quad 0 \le i < M$$

• Visualization

From the generated sequence of measurements, two types of diagrams can be created.
The first is a 1D map, 1DP, sorted directly from $\{p_i\}_{i=0}^{M-1}$ and shown in Fig. 3a. The
second is a 2D map, 2DP, sorted from pairs of measurements $\{p_i, p_{i+1}\}$
created from $\{p_i\}_{i=0}^{M-1}$ and shown in Fig. 3b. This mapping scheme is one of the Markov
chain models.

Fig. 3 Two maps; a 1DP; b 2DP
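
The three components can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the input is taken as a string of '0'/'1' characters, a trailing partial segment is dropped, and the 2DP map is assumed to count consecutive pairs $(p_i, p_{i+1})$.

```python
from collections import Counter


def segment(bits, m):
    """Split a 0-1 string into consecutive segments of length m."""
    usable = len(bits) - len(bits) % m      # drop an incomplete final segment
    return [bits[i:i + m] for i in range(0, usable, m)]


def measure(segments):
    """p_i = number of 1s in segment s_i."""
    return [seg.count("1") for seg in segments]


def map_1dp(p):
    """1DP: histogram of the measurements p_i."""
    return Counter(p)


def map_2dp(p):
    """2DP: histogram of consecutive pairs (p_i, p_{i+1})."""
    return Counter(zip(p, p[1:]))


sequence = "0001110110" + "1111100000"
p = measure(segment(sequence, 5))
print(p)
print(map_1dp(p))
print(map_2dp(p))
```

The first two segments are the ones used in the text (00011 and 10110), so the first two measurements are 2 and 3. Plotting the 1DP counter as a bar chart and the 2DP counter as a pseudocolor image would give maps of the kind shown in Fig. 3.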


4 Random Sequence Data Sources 


In this chapter, the pseudorandom generator is based on an AES block cipher in
OFB mode. A total of 120 MB of cryptographic sequences has been generated.


5 Mapping Results 


The input sequence is mapped with a list of various segment lengths. Three sets of
m lengths are selected, and the two types of relevant 1DP
and 2DP maps are shown in Fig. 4a-c, for (a) m = {8, 16, 32, 64, 128, 256}, (b)
m = {80, 100, 120, 140, 160}, and (c) m = {126, 127, 128, 129, 130}. Four enlarged
2DP maps are shown in Fig. 5 for m = {126, 127, 128, 129} and two enlarged 2DP
maps are shown in Fig. 6 for m = {128, 130}, respectively.


6 Result Analysis 


In Fig. 4, both 1DP and 2DP maps are illustrated. When the input sequence is long
enough relative to $m \times 2^m$, the 1DP maps correspond to binomial
distributions. It is interesting to see significant changes when various segment lengths
are applied.
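
As a reference point for reading the 1DP maps, the binomial shape can be written out explicitly. The following is a sketch under the idealized assumption that the bits of the sequence are independent and unbiased; the chapter itself does not state this model.

$$P(p_i = k) = \binom{m}{k}\, 2^{-m}, \qquad E[p_i] = \frac{m}{2}, \qquad \mathrm{Var}[p_i] = \frac{m}{4}, \qquad 0 \le k \le m$$

For example, with m = 128 the expected count of 1s per segment is 64 with a standard deviation of $\sqrt{32} \approx 5.66$, so an ideal 1DP map would be a bell-shaped curve centred at p = 64, and the expected height of each bar is $M \binom{m}{k} 2^{-m}$ for M segments.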
Fig. 4 1DP and 2DP maps. a m = {8, 16, 32, 64, 128, 256}; b m = {80, 100, 120, 140, 160}; c
m = {126, 127, 128, 129, 130}; d enlarged 1DP and 2DP, m = {126, 127, 128, 129, 130}


For the various 2DP maps in Figs. 4, 5, and 6, 2D distributions are represented in
pseudocolor to illustrate the relevant 3D structures. From smaller maps to enlarged maps,
many interesting features can be identified, including significant symmetric or
nonsymmetric properties. Enlarged maps show further refined patterns in
detail.

Fig. 5 Enlarged 1DP maps. a m = 126; b m = 127; c m = 128; d m = 129
(panel titles: (a) m = 126, p_count_max = 46271 (p = 63.0); (b) m = 127, p_count_max = 45719 (p = 63.0))


7 Conclusion 


The mapping model in this chapter focuses on a single sequence and two types of maps,
1DP and 2DP. 1DP maps correspond to classical statistical maps, and 2DP maps
are represented as various Markov chains. Further research and experiments are
required to explore useful tools for cryptographic sequences in detail (Figs. 7 and 8).

Fig. 6 Enlarged 2DP maps. a m = 126; b m = 127; c m = 128; d m = 129

Fig. 7 Enlarged 1DP maps. a m = 128; b m = 130

Fig. 8 Enlarged 2DP maps. a m = 128; b m = 130
Refined Stationary Randomness
of Quantum Random Sequences
on Variant Maps


Jeffrey Zheng, Yamin Luo and Zhefei Li 


Abstract In this chapter, a testing model applies statistical probability in
multiple distributions on three maps to check the refined stationary randomness
of selected quantum sequences. Three random data sequences are collected
from two quantum random resources: one from the Australian National University
(ANU) and two (initial and secure) from the University of Science and Technology of
China (USTC). Multiple results are created on the three maps, and measurements of
stationary randomness are illustrated and compared. The three samples show distinct
stationary properties.


Keywords Variant maps · Quantum random sequence · Chaotic random sequence ·
Ordered measures · Maximal · Stationary randomness


1 Introduction 


In advanced social network environments, multimedia signal sequences of big data
streams are composed of time series processes. Quantum experiments on a quantum
satellite using quantum key distribution (QKD) systems [1] are the most advanced ICT
development for establishing ultra-secure quantum communications. For a QKD system,
a truly random number generator [2] plays a key role. From an analysis viewpoint,
it is necessary to test stationary randomness under time variations. In this section, a list
of relevant schemes is discussed: pseudo/truly random sequences, P_value, statistical
probability distributions, optical statistics, stationary properties, and variant maps.


This work was supported by the Key Project on Electric Information and Next Generation IT
Technology of Yunnan (2018ZI002), NSF of China (61362014), Yunnan Advanced Overseas Scholar
Project.

J. Zheng (✉)
Key Laboratory of Quantum Information of Yunnan, Yunnan University, Kunming, China
e-mail: conjugatelogic@yahoo.com

J. Zheng
Key Laboratory of Software Engineering of Yunnan, Yunnan University, Kunming, China

Y. Luo · Z. Li
Yunnan University, Kunming, China
e-mail: 1047668416@qq.com

Z. Li
e-mail: 576167164@qq.com

© The Author(s) 2019
J. Zheng (ed.), Variant Construction from Theoretical Foundation to Applications,
https://doi.org/10.1007/978-981-13-2282-2_20


1.1 Pseudo/True Random Sequences 


1.1.1 Pseudorandom Sequences 


Traditional stream ciphers [3] based on the linear feedback shift register (LFSR)
structure are used as pseudorandom number generators. LFSR stream ciphers are the
core of classical stream ciphers.

The new generation of stream ciphers has been shifting from LFSR [3] to nonlinear
modes: NLFSR, clock control [4], nonlinear functions, etc. Nonlinear mathematical
theories, recursive models, descriptive tools, and implementation schemes are
difficult to use in nonlinear dynamic environments.


1.1.2 True Random Sequences 


Unlike pseudorandom sequences generated by stream ciphers, truly random sequences
are generated by high-quality stochastic oscillators in special hardware
devices based on laser photonics [5], nonlinear optics, quantum optics [6], quantum
noise, thermal noise, chaos, and fractal nonlinear dynamics [7].


1.2 Testing Schemes 


1.2.1 P_value Schemes 


Various statistical testing packages measure the randomness properties of a given random
sequence. The NIST 800-22 package [8] is a typical representative and provides more than
15 testing schemes. Using the package, it is essential to check whether P_value
> 0.01 for the sequence. Since such a measuring scheme provides a static condition,
it is difficult to use the P_value parameter alone to express the complex dynamic behaviors
involved in random sequences.
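
As an illustration of a P_value scheme, the frequency (monobit) test of NIST SP 800-22 can be sketched in a few lines. The function name `monobit_p_value` is illustrative; the formula (normalized partial sum followed by the complementary error function) is the one defined for that test, and the input is assumed to be a string of '0'/'1' characters.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: map bits to +1/-1, sum them, and convert
    the normalized absolute sum to a p-value with erfc."""
    n = len(bits)
    s_n = sum(1 if b == "1" else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


p_value = monobit_p_value("1011010101")
print(round(p_value, 6), p_value > 0.01)
```

The ten-bit input is the worked example from the NIST document, for which the p-value is about 0.527 and the sequence passes at the 0.01 level. A single number of this kind summarizes the whole sequence at once, which is the static character noted above.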


1.2.2 Multiple Statistical Probability Distributions 


When random sequences are measured under segment conditions, multiple statistical
probability schemes are useful for creating various distributions that illustrate complex
spatial relationships.
Multivariate normal probability distributions are the most important and powerful
tools for testing the stochastic characteristics of a random data sequence under the
framework of probability, stochastic processes and statistics [9] for nonlinear problems. In
this kind of measuring model, when a data sequence is sufficiently long, the high
dimensional probability distribution of the sequence [10] converges to a continuous
Gaussian distribution. Multivariate Gaussian probability distributions support
various schemes for analyzing complex stochastic data sets of measuring sequences under
continuous conditions.


1.2.3 Photon Statistic in Quantum Optics 


Photon statistics is the theoretical and experimental study of the statistical
distributions produced in photon counting experiments, used to analyze the statistical
nature of photons in a light source.

Three types of distributions can be obtained, depending on the light source [11]:
Poissonian, super-Poissonian, and sub-Poissonian. Each distribution is identified by the
relation between the variance and the average number of photon counts: the variance equals
the average for Poissonian light, exceeds it for super-Poissonian light, and falls below it
for sub-Poissonian light. Both Poissonian and super-Poissonian light are described by a
semi-classical theory in which the light source is modeled as an electromagnetic wave and
the atom is modeled by quantum mechanics. In contrast, sub-Poissonian light requires the
quantization of the electromagnetic field for a proper description and is a direct measure
of the particle nature of light.
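
The three cases can be told apart by the ratio of variance to mean (often called the Fano
factor). The sketch below is a minimal illustration; the function names and the tolerance
tol are assumptions made here, not values from the chapter.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Variance-to-mean ratio of a list of photon counts (mean must be > 0)."""
    mu = mean(counts)
    return pvariance(counts, mu) / mu


def classify(counts, tol=0.05):
    """Label the counts by comparing the Fano factor with 1 (tol is illustrative)."""
    f = fano_factor(counts)
    if f > 1 + tol:
        return "super-Poissonian"
    if f < 1 - tol:
        return "sub-Poissonian"
    return "Poissonian"
```

In practice the tolerance would have to reflect the sampling error of the variance estimate
for the number of counting windows used.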


1.2.4 Stationary Properties 


In mathematics and statistics, a stationary process is a stochastic process [12] whose
joint probability distribution does not change when shift operations are performed.
Consequently, parameters such as the mean and variance, if they are present, also do not
change over time. Stationarity is an interesting property in time series analysis.

In applied mathematics, the Wiener-Khinchin theorem [13] states that the Autocorrelation
Function (ACF) of a wide-sense stationary process has a spectral decomposition
given by the power spectrum of the process. One effective way of identifying a
stationary time series is the ACF plot [14]: for a stationary time series,
the ACF drops to zero relatively quickly.
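
A sample ACF of the kind plotted in such a test can be computed as in the following minimal
sketch. The function name acf is illustrative, and the normalization by the lag-0 sum is one
common convention, assumed here; the input must not be constant.

```python
def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (x must not be constant)."""
    n = len(x)
    mu = sum(x) / n
    d = [v - mu for v in x]            # deviations from the mean
    c0 = sum(v * v for v in d)         # lag-0 sum, used for normalization
    return [
        sum(d[t] * d[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]
```

For a strictly alternating sequence the values stay large at every lag, whereas for a
stationary random sequence they are expected to fall close to zero after lag 0.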


1.3 Quantum Random Resources 


Quantum random numbers can be generated from a physical quantum source: a beam of
coherent laser light is split into two beams, and the power in each beam is measured.
The light intensity in each beam fluctuates about the mean, and those fluctuations can
be converted into a source of random numbers [15-17] following a stationary Poisson
distribution.


1.3.1 ANU Resource


The ANU Quantum Random Numbers Server is an open website [18] to offer true 
random numbers to anyone on the internet. Such random numbers are generated in 
real-time by measuring the quantum fluctuations of the vacuum. The electromagnetic 
field of the vacuum exhibits random fluctuations in phase and amplitude at all fre- 
quencies. By carefully measuring these fluctuations, ultra-high bandwidth random 
numbers can be generated. 

About 1 GB of data streams was downloaded, and 100 MB of them is used for
the testing.


1.3.2 USTC Resource 


In the Key Laboratory of Quantum Information, USTC, CAS, true random
number sequences are generated [16]. This type of true random sequence supports
advanced quantum communication devices in QKD systems [19].

More than 20 GB of quantum random number sequences were provided by USTC
for random stream testing. Two data sequences are denoted USTCp (initial)
and USTC (secure), respectively. About 100 MB of data is selected from each
sequence.


1.3.3 Refined Properties 


From an analysis viewpoint, a Toeplitz hash algorithm is used, taking the initial
sequence USTCp as input and producing the USTC sequence as output. The variations
introduced by this refinement are an interesting property to identify in detail.

From a random testing viewpoint, the initial sequences have some difficulty passing
the NIST tests, whereas the secure sequences are ensured to pass them. Some refined
differences in random characteristics could therefore be distinguished.


1.4 Variant Framework 


Various schemes following the top-down strategy have been explored for many years in
combinatorial algorithms [20] and in image analysis and processing. They use multiple
measures to partition special phase spaces from a top state set into multiple bottom
states via multiple levels of a hierarchy.

The conjugate classification [21] was proposed to apply seven measures in a hierarchy
to partition the kernels of four regular plane lattices, for the cases n = {4, 5, 7, 9},
for 2D binary images. For 1D cellular automata sequences, global random behaviors
are visualized in 2D maps.

For n-tuple bit vectors, the variant logic framework [22] was proposed, and various
applications have been explored: a 3D visual method for random number sequences [23],
a variant Pseudorandom Number Generator (PRNG) [24], computational simulation
of quantum interactions [25], noncoding DNA analysis, bat echolocation [26], and
stationary randomness [27].


1.5 Proposed Scheme 


For the convenience of testing stationary randomness on random sequences, we
propose a testing system for a stationary random sequence of length N. The sequence is
divided into M segments of a given length m. From each 0-1 segment a 2-tuple pair of
measures is extracted: the number of 1 elements and the number of 01 patterns in the
segment. The paired measures form a sequence of M pairs, that is, an ordered measuring
set with M elements.

The pairs of the measuring sequence are then separated into two independent
measuring sequences, each keeping its parameter in the same order. In total, three
sequences of distinct measures are constructed: two sequences of single
measures and one sequence of 2-tuple measures.

Following this approach, the two single measuring sequences are sorted into two
1D numeric arrays, giving statistical histograms that correspond to 1D maps, and the
2-tuple measuring sequence is sorted into a 2D integer array, giving a statistical
histogram that is a 2D map. Under controlled changes of the shift displacement, the
results of the three measuring sequences are transformed into 1D statistical histograms
and 2D pseudo-color maps that show the patterns of the generated sequence
under various positions and conditions over a list of shift operations.
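
The projection of the paired measures onto the three histograms can be sketched as follows.
This is a minimal illustration, assuming the M pairs (p, q) are already available as Python
tuples; the function name project_maps is hypothetical.

```python
from collections import Counter


def project_maps(pairs):
    """Return the (1DP, 1DQ, 2DPQ) histograms for a list of (p, q) tuples."""
    one_dp = Counter(p for p, _ in pairs)   # counts of each p value
    one_dq = Counter(q for _, q in pairs)   # counts of each q value
    two_dpq = Counter(pairs)                # counts of each (p, q) pair
    return one_dp, one_dq, two_dpq
```

Each histogram sums to M, so the 1DP and 1DQ histograms are the two marginals of the 2DPQ
histogram.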


1.6 Organization of the Chapter 


This chapter describes a testing system for stationary random sequences and its system
architecture in Sect. 2. In Sect. 3, test results are provided for three quantum random
sequences. Based on the visual maps in Sect. 3, result analysis and a brief
comparison are described in Sect. 4. Finally, the main results are summarized in
Sect. 5.


2 Testing System 


The architecture of the testing system is shown in Fig. 1.

Fig. 1 The architecture of testing stationary random sequences


2.1 System Architecture 


This system is composed of five parts: Input, Shifted Transformation (ST), Segment
Measurement (SM), Combinatorial Projection (CP), and Output.

The input of the testing system is a selected 0-1 sequence, which is processed by the
ST, SM, and CP modules in turn. The output is composed of three maps, two in 1D and one
in 2D, for visual distributions, together with three maximals.

Further technical details are described in the chapter "Stationary Randomness of
Three Types of Six Random Sequences on Variant Maps" of this book.


3 Testing Results 


Three quantum random sequences are selected from the ANU and USTC resources.

Typical results of testing stationary properties for the three sequences are shown as
nine maps in Fig. 2. Three sets of results are shown in Fig. 3a, b. In the top part of
Fig. 3a, six values r = {0, 16, 32, 96, 112, 128} are selected to show three pairs of
corresponding maps, 1DP, 2DPQ, and 1DQ, for the three sequences. In the bottom part,
nine 2D maps of maximal curves for r = 0 to 128 illustrate refined properties of the
stationary random curves. In Fig. 3b, three maximal curves on
three 2D maps are compared. In Fig. 4a-c, three larger maps for r = {48, 64, 80}
are shown, corresponding to (a) 1DP, (b) 2DPQ, and (c) 1DQ for the three cases. Three
larger maps of the three maximal curves are shown in Fig. 5.


3.1 Quantitative Measurements 


For a G map, let G_x be an average variation, ΔG_x be a region of variations, and
G_x^R = ΔG_x/G_x be a variation ratio. For convenience in comparison, let {Max, Min}
be the {largest, smallest} value on a maximal curve; Max−Min is their difference, and
|ANU − USTC| is the absolute difference between the ANU and USTC measures.
Let (Max − Min)/|ANU − USTC| be the relative ratio between Max−Min and
|ANU − USTC|.

Fig. 2 ANU, USTC, and USTCp random sequences on 1DP, 2DPQ, and 1DQ maps
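
The relative ratio defined above is a one-line computation. The sketch below is a minimal
illustration, assuming the three measures are held in a dictionary keyed by sequence name;
the function name relative_ratio and the dictionary layout are hypothetical.

```python
def relative_ratio(values):
    """(Max - Min) / |ANU - USTC| for one measure of the three sequences."""
    spread = max(values.values()) - min(values.values())
    base = abs(values["ANU"] - values["USTC"])
    return spread / base
```

A ratio close to 1 means that the largest gap among the three sequences is the ANU/USTC gap
itself, while a large ratio means that the third sequence lies far outside that gap.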


4 Result Analysis 


The nine maps in Fig. 2 are arranged in three columns. The three 1DP maps have similar
bell-shaped distributions that illustrate Poissonian distributions. The three 2DPQ maps
are 2D distributions with different symmetric distributions. Maximal elements in the
ANU, USTC, and USTCp maps show stronger vertically oriented features. The three maps
are symmetric in the left/right direction and show a broken symmetry in the up/down
direction. Pseudo-color pixels on the three maps are shown in 3D shapes. Compared
with the three 1DP maps, the three 1DQ maps have similar distributions with narrower
bell shapes that illustrate sub-Poissonian distributions.

Six groups of results for shifts r = {0, 16, 32, 96, 112, 128} are shown in the top part
of Fig. 3a, and each group contains nine distributions in three columns. The three
random sequences have stronger stationary randomness, which keeps all maps in a
similar style with only minor changes under shift operations. The larger maps for
r = {48, 64, 80} in Fig. 4a-c provide refined visual information that shows their
variations in detail. Enlarged and larger maximal curves are shown in Figs. 3b and 5 for
r = 0 to 128 as nine 2D maps, with the values of average variation and region of
variations. From the maximal and minimal stationary regions, variation ratios of 1-2% are
observed for 1DP and 1DQ and of 5% for 2DPQ. Three curves of maximals on
three 2D maps are illustrated in Figs. 3b and 5.

Fig. 3 ANU, USTC, and USTCp random sequences on three maps and maximals (a), (b); a Three
pairs of nine variant maps in six groups and three pairs of nine maximal maps; b Three 2D maps of
three maximal curves for ANU, USTC, and USTCp

Fig. 4 ANU, USTC, and USTCp random sequences on enlarged maps, r =
{48, 64, 80}; a 1DP; b 2DPQ; c 1DQ


4.1 Relative Ratios on Differences


Details of the three maximal measures are compared in Table 1. For the three parameters
{Q_x, ΔQ_x, Q_x^R} on the 1DQ maps, the relative ratios (Max − Min)/|ANU − USTC| are
all 1. The ratios are 81 for P_x and 1.6 for P_x^R on the 1DP maps, and 65 for PQ_x
and 7.9 for PQ_x^R on the 2DPQ maps.

From this set of testing results, the ANU and USTC samples show similar stationary
properties, whereas USTCp shows different stationary properties among the three
sequences. Significant differences in relative ratios are observed in the 2DPQ
variation measurements.

Fig. 5 Three enlarged 2D


Table 1 Comparisons on three measures for ANU, USTC, and USTCp samples

                              Q_x        ΔQ_x       Q_x^R
ANU:                          13.961%    0.17761%   1.2722%
USTC:                         14.09%     0.18%      1.27%
USTCp:                        14.02%     0.18%      1.27%
Min:                          13.961%    0.17761%   1.27%
Max:                          14.09%     0.18%      1.2722%
Max−Min:                      0.129%     0.0239%    0.0022%
|ANU − USTC|:                 0.129%     0.0239%    0.0022%
(Max − Min)/|ANU − USTC|:     1          1          1

                              P_x        ΔP_x       P_x^R
ANU:                          7.0352%    0.15472%   2.1992%
USTC:                         7.05%      0.13%      1.87%
USTCp:                        8.24%      0.14%      1.68%
Min:                          7.0352%    0.13%      1.68%
Max:                          8.24%      0.15472%   2.1992%
Max−Min:                      1.2048%    0.02472%   0.5192%
|ANU − USTC|:                 0.0148%    0.02472%   0.3292%
(Max − Min)/|ANU − USTC|:     81         1          1.6

                              PQ_x       ΔPQ_x      PQ_x^R
ANU:                          0.99245%   0.04791%   4.8276%
USTC:                         0.99%      0.05%      5.01%
USTCp:                        1.15%      0.05%      3.56%
Min:                          0.99%      0.04691%   3.56%
Max:                          1.15%      0.05%      5.01%
Max−Min:                      0.16%      0.00209%   1.45%
|ANU − USTC|:                 0.00245%   0.00209%   0.1824%
(Max − Min)/|ANU − USTC|:     65         1          7.9


5 Conclusion 


It is feasible to evaluate the stationary randomness of a random sequence using the
testing system. From the three maps {1DP, 1DQ, 2DPQ}, maximals are identified for
shifts r from 0 to m. Three 2D maps of maximal curves provide refined characteristics
for evaluating stationary randomness. Further exploration is required
to check the testing system in other applications.


Using Information Entropy to Measure
Stationary Randomness of Quantum
Random Sequences


Weizhong Yang, Yamin Luo, Zhefei Li and Jeffrey Zheng 


Abstract Different statistical measurements can be used to determine the stationary
randomness of random sequences. This chapter proposes a testing scheme for random
sequences that uses information entropy as the measurement. Datasets are collected
from the University of Science & Technology of China (USTC), and three quantum random
sequences are selected for testing. Multiple results are created on three maps, and
entropy curves and quantitative measurements of stationary randomness are compared. The
three differences of Max-Min entropy variation ratios are bounded in the [0.08, 0.09]%
region. The whole structure has measurable stationary properties.


1 Introduction


From a statistical viewpoint, various parameters of a statistical process [2-4, 7] could
be stationary invariant [6] under shift operations on random sequences. Using variant
maps [8], it is a normal approach to transform a long random sequence into 1D and 2D
statistical distributions as three maps: 1DP, 1DQ, and 2DPQ [9]. For each map, each
counting number is divided by the total number to turn it into a probability measure.
In this way, three sets of probability measures are generated. Applying the information
entropy function to summarize all probability parameters of a map, each map corresponds
to one information entropy measurement, determined by its distribution, for stationary
randomness.


2 Test Methodology 


The test for stationary randomness requires a sequence of length N. The given
input sequence is divided into M segments of a given length m. From each 0-1 segment
a 2-tuple pair of measures is extracted: the number of 1 elements and the number
of 01 patterns in the segment. The paired measures form a sequence of M pairs, that
is, an ordered measuring set with M elements.

The pairs of the measuring sequence are then separated into two independent
measuring sequences, each keeping its parameter in the same order. In total, three
sequences of distinct measures are constructed: two sequences of single
measures and one sequence of 2-tuple measures.

Following this approach, the two single measuring sequences are sorted into two
1D numeric arrays, giving statistical histograms that correspond to 1D maps, and the
2-tuple measuring sequence is sorted into a 2D integer array, giving a statistical
histogram that is a 2D map. Under controlled changes of the shift displacement, the
results of the three measuring sequences are transformed into 1D statistical histograms
and 2D pseudo-color maps that show the patterns of the generated sequence
under various positions and conditions over a list of shift operations.


2.1 Dataset 


2.1.1 USTC Resource 


In the Key Laboratory of Quantum Information, USTC, CAS, quantum random
number sequences are generated [5]. This type of true random sequence supports
advanced quantum communication devices in QKD systems [1].

More than 20 GB of quantum random number sequences were provided by USTC
for random stream testing. Three of eight sequences are selected, from
three stages (1 Initial, 2 Secure, and 4 Filtered). Each random sequence has a length
of about 8 MB.

Fig. 1 Methodology for information entropy testing stationary random sequences


3 Method 


3.1 Methodology 


This method consists of five steps (Fig. 1): Input, Shifted Transformation (ST),
Segment Measurement (SM), Combinatorial Projection (CP), and Output.

The input of the testing system is a selected 0-1 sequence, which is processed by the
ST, SM, and CP steps in turn. The output is composed of three maps, two in 1D and one
in 2D, for visual distributions, together with three entropy measures.


3.2 Description of Steps 


The testing system consists of three steps: {ST, SM, CP}.


Input: X, a 0-1 sequence of N = m × M bits; m, the segment length; M, the total number
of segments; r, the shift length.
Output: Three maps {1DP, 1DQ, 2DPQ}; three entropies {1DP_e, 1DQ_e, 2DPQ_e}.
Process: Shift X by r positions to obtain Y = X(r) in ST. Make the segment measuring
sequences in SM, then project the three measuring sequences as three maps
and extract the three entropies in CP.

Let X, Y be 0-1 sequences with N elements. ST takes the sequence X as input and
shifts the whole sequence by r positions to obtain the shifted sequence Y = X(r)
(i.e., a cyclic shift to the right, +, or to the left, −).

$$Y = X(r), \quad Y[I] = X[I + r], \quad I + r \pmod{N}, \quad 0 \le I < N; \quad X[I], Y[I] \in \{0, 1\} \tag{1}$$
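
Equation (1) can be sketched in a few lines. The function name shift is illustrative;
the sequence is assumed to be held as a Python list of 0/1 integers.

```python
def shift(x, r):
    """Cyclic shift of Eq. (1): Y[I] = X[(I + r) mod N]; r may be negative."""
    n = len(x)
    return [x[(i + r) % n] for i in range(n)]
```

Because the index is taken modulo N, shifting by N (or by 0) returns the original
sequence, and a negative r shifts in the opposite direction.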


SM takes the shifted vector as input and divides it into M segments. The element at
the jth position, 0 ≤ j < m, of the ith sub-vector, 0 ≤ i < M, is denoted Y_{i,j}.

After the segmenting operation, the sub-vectors form an m × M matrix: the m positions
of the ith complete row vector in the sequence correspond
to a pair of 2-tuple measures (p_i, q_i).


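$$Y = \{Y_0, \ldots, Y_i, \ldots, Y_{M-1}\} \tag{2}$$

$$Y_i = \{Y_{i,0}, Y_{i,1}, \ldots, Y_{i,j}, \ldots, Y_{i,m-1}\}, \quad 0 \le i < M,\ 0 \le j < m \tag{3}$$

$$Y_i \rightarrow (p_i, q_i) \tag{4}$$

$$\{Y_i\}_{i=0}^{M-1} \rightarrow \{(p_i, q_i)\}_{i=0}^{M-1} \tag{5}$$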

The pair of 2-tuple measures (p;, qi) is determined by the following formula: 


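$$Y_{i,j} = Y[J] \in \{0, 1\}; \quad J = i \times m + j \tag{6}$$

$$p_i = \sum_{j=0}^{m-1} Y_{i,j}, \quad Y_{i,j} \in \{0, 1\},\ 0 \le p_i \le m \tag{7}$$

$$q_i = \sum_{j=0}^{m-1} \left[(Y_{i,j-1}, Y_{i,j}) == (0, 1)\right], \quad j - 1 \pmod{m},\ 0 \le q_i \le \lfloor m/2 \rfloor \tag{8}$$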

For example, for X = 0011010010 with N = 10, M = 2, m = 5, and no shift (r = 0):
(p_0 = 2, q_0 = 1); (p_1 = 2, q_1 = 2).
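
The segment measures can be sketched as follows. This is a minimal illustration of the
two measures (the number of 1 elements, and the number of 01 patterns counted cyclically
within the segment); the function name segment_measures is hypothetical, and any
incomplete trailing segment is simply ignored, which is an assumption made here.

```python
def segment_measures(y, m):
    """Return the list of (p_i, q_i) pairs for segments of length m of the 0/1 list y."""
    pairs = []
    for start in range(0, len(y) - m + 1, m):
        seg = y[start:start + m]
        p = sum(seg)  # number of 1 elements
        # Number of 01 patterns; seg[j - 1] with j = 0 wraps to the last
        # element, which realizes the index j - 1 (mod m).
        q = sum(1 for j in range(m) if seg[j - 1] == 0 and seg[j] == 1)
        pairs.append((p, q))
    return pairs
```

Applied to the example sequence 0011010010 with m = 5, the function returns the pairs
(2, 1) and (2, 2).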

The output of SM is M pairs of ordered 2-tuple measures {(p_i, q_i)}_{i=0}^{M-1}.

CP consists of Split and Projection steps. Split takes the 2-tuple measuring
sequence {(p_i, q_i)}_{i=0}^{M-1} and splits it into two independent measuring sequences,
{p_i}_{i=0}^{M-1} and {q_i}_{i=0}^{M-1}, keeping the original order invariant.

The three measuring sequences are {p_i}_{i=0}^{M-1}, {q_i}_{i=0}^{M-1}, and {(p_i, q_i)}_{i=0}^{M-1}.

The Projection step turns the sequences into histograms through three operations:
Project Array (PA), Color Map (CM), and Get Entropy (GE). For the three measuring
sequences, the two types of measures, 1D and 2D, are processed separately.

The PA transforms the measuring sequences into integer arrays, and
the CM organizes them as either normalized histograms (1D measures) or color
maps (2D measures), respectively.

The 1D measures involve two measuring sequences: {p_i}_{i=0}^{M-1} and {q_i}_{i=0}^{M-1}.
Let P[m + 1], Q[⌊m/2⌋ + 1] and NP[m + 1], NQ[⌊m/2⌋ + 1] be 1D (integer,
float) arrays representing the corresponding elements.

The 1DP statistical histogram is generated from the sequence {p_i}_{i=0}^{M-1}, with NP, P
being two arrays (floating point, integer) of (m + 1) elements. For the jth elements
NP[j], P[j], 0 ≤ j ≤ m, and the entropy element 1DP_e, the output can be obtained
by the following procedure (terms with NP[j] = 0 contribute 0 to the entropy):


Initialization: ∀ NP[j] = 0.0,
                P[j] = 0, 0 ≤ j ≤ m;
Calculation:    for (i = 0; i < M; i++)
                { P[p_i]++; }
Normalization:  for (j = 0; j <= m; j++)
                { NP[j] = P[j]/M; }
Get Entropy:    1DP_e = −Σ_{j=0}^{m} NP[j] * log(NP[j])


In the 1DP map, the PA corresponds to Initialization and Calculation, the CM
handles Normalization, and the GE determines the entropy element of the map.
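
The four stages of the procedure (Initialization, Calculation, Normalization, Get
Entropy) can be sketched for a 1D measuring sequence as follows. The function name
histogram_entropy is illustrative. The base of the logarithm is left as a parameter,
with base 2 as an assumed default, because changing the base only rescales the entropy
by a constant factor.

```python
import math
from collections import Counter


def histogram_entropy(measures, base=2):
    """Entropy of the normalized histogram of a 1D measuring sequence."""
    counts = Counter(measures)           # Initialization and Calculation
    total = len(measures)                # M, the number of segments
    h = 0.0
    for c in counts.values():            # only non-empty bins are visited
        w = c / total                    # Normalization
        h -= w * math.log(w, base)       # Get Entropy
    return h
```

Empty bins are skipped, which corresponds to the usual convention that a bin with zero
probability contributes nothing to the entropy.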

The 1DQ statistical histogram is generated from the sequence {q_i}_{i=0}^{M-1}, with NQ, Q
being two arrays (floating point, integer) of (⌊m/2⌋ + 1) elements. For the jth elements
NQ[j], Q[j], 0 ≤ j ≤ ⌊m/2⌋, and the entropy element 1DQ_e, the output can be
obtained by the following procedure:


Initialization: ∀ NQ[j] = 0.0,
                Q[j] = 0, 0 ≤ j ≤ ⌊m/2⌋;
Calculation:    for (i = 0; i < M; i++)
                { Q[q_i]++; }
Normalization:  for (j = 0; j <= ⌊m/2⌋; j++)
                { NQ[j] = Q[j]/M; }
Get Entropy:    1DQ_e = −Σ_{j=0}^{⌊m/2⌋} NQ[j] * log(NQ[j])


In the 1DQ map, the PA corresponds to Initialization and Calculation, the CM
handles Normalization, and the GE identifies the entropy element of the map.

Using the P, NP, Q, NQ arrays, it is possible to generate the corresponding 1D
statistical histograms as 1D maps.

The 2D measures process one measuring sequence: {(p_i, q_i)}_{i=0}^{M-1}. Let
PQ, NPQ be two 2D (integer, float) arrays.

A 2DPQ statistical histogram is generated from the sequence {(p_i, q_i)}_{i=0}^{M-1}, with
PQ, NPQ being 2D arrays of (m + 1) × (⌊m/2⌋ + 1) elements. For the (i, j)th elements
PQ[i, j], NPQ[i, j], 0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋, and the entropy element 2DPQ_e,
their values can be obtained by the following procedure:


Initialization: ∀ PQ[i, j] = 0,
                0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋;
Calculation:    for (i = 0; i < M; i++)
                { PQ[p_i, q_i]++; }
Pseudo-color:   Matching a proper color for
                ∀ PQ[i, j], 0 ≤ i ≤ m, 0 ≤ j ≤ ⌊m/2⌋
Normalization:  for (i = 0; i <= m; i++)
                for (j = 0; j <= ⌊m/2⌋; j++)
                { NPQ[i, j] = PQ[i, j]/M; }
Get Entropy:    2DPQ_e = −Σ_{i=0}^{m} Σ_{j=0}^{⌊m/2⌋} NPQ[i, j] * log(NPQ[i, j])


In the 2DPQ map, the PA corresponds to Initialization and Calculation, the CM
handles Pseudo-color and Normalization, and the GE identifies the entropy element of
the map.
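
For the 2D case, the same stages operate on an array of (m + 1) × (⌊m/2⌋ + 1) cells. The
following minimal sketch omits the Pseudo-color stage; the function names are
illustrative, and the logarithm base is again an assumed parameter.

```python
import math


def joint_histogram(pairs, m):
    """2DPQ integer array: rows indexed by p (0..m), columns by q (0..m//2)."""
    pq = [[0] * (m // 2 + 1) for _ in range(m + 1)]   # Initialization
    for p, q in pairs:                                # Calculation
        pq[p][q] += 1
    return pq


def joint_entropy(pq, base=2):
    """Entropy of the normalized 2D histogram; empty cells contribute 0."""
    total = sum(map(sum, pq))
    h = 0.0
    for row in pq:
        for c in row:
            if c:
                w = c / total                         # Normalization
                h -= w * math.log(w, base)            # Get Entropy
    return h
```

With the two pairs of the earlier example, (2, 1) and (2, 2), and m = 5, the array has
6 × 3 cells, two of which hold a count of 1.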

Through the CP module, the three measuring sequences are transformed into two
1D arrays and one 2D array with (m + 1), (⌊m/2⌋ + 1), and (m + 1) × (⌊m/2⌋ + 1)
clusters, respectively.

The outputs of the testing system are three maps {1DP, 1DQ, 2DPQ} and three
entropies {1DP_e, 1DQ_e, 2DPQ_e}, which serve as the expected statistical distributions
and as representatives of the input 0-1 sequence, respectively.


4 Results 


Three quantum random sequences are selected from the USTC {1, 2, 4} streams.

Typical results of testing stationary properties for the three sequences are shown as
nine maps in Fig. 2. The top part contains three 2D maps of global entropy curves for
r = 0 to 128, which illustrate refined properties of the stationary random curves.
Three sets of variant maps at r = 0 and their enlarged entropy curves for r = 0 to 128
are shown in three columns, corresponding to the 1DP, 1DQ, and 2DPQ maps of the three
sequences. Three larger maps of the three global entropy curves are shown in Fig. 3.

For a G map, let G_e be an average entropy variation, ΔG_e be a region of entropy
variations, and G_e^R = ΔG_e/G_e be an entropy variation ratio. Three entropy curves
on three 2D maps are compared. The three entropy measurements and their {Max, Min,
Max-Min} values for the three sequences are listed in Table 1. Three variation ratios
and their numeric quantities are listed in Table 2.


5 Result Analysis 


Three 2D maps of global entropy curves show stronger stationary randomness under
shift operations for r = 0 to 128. The three entropy curves on each map are three stable
horizontal lines. From a global viewpoint, the entropy curves of No. 1 (PQ and P)
differ significantly from those of the No. 2 and No. 4 cases, while No. 2 and No. 4
have similar measures.

Fig. 2 Three USTC random sequences: {1, 2, 4} on 2DPQ, 1DP, and 1DQ maps and r = 0 to 128

Fig. 3 Three enlarged 2D maps of global entropy curves for three USTC random sequences
(panels 1, 2, and 4: entropy curves of each step; legend: Pinfoentropy, Qinfoentropy,
PQinfoentropy)


Table 1 Comparisons on three measures for three USTC samples

            Q_e       ΔQ_e     Q_e^R
1:          2.4537    0.63%    0.2559%
2:          2.4638    0.83%    0.335%
4:          2.46      0.86%    0.3481%
Min:        2.4537    0.63%    0.2559%
Max:        2.4638    0.86%    0.3481%
Max−Min:    0.0101    0.23%    0.0922%

            P_e       ΔP_e     P_e^R
1:          2.9948    0.82%    0.2742%
2:          3.1502    0.84%    0.267%
4:          3.1472    0.61%    0.1937%
Min:        2.9948    0.61%    0.1937%
Max:        3.1502    0.84%    0.2742%
Max−Min:    0.1554    0.23%    0.0805%

            PQ_e      ΔPQ_e    PQ_e^R
1:          5.4397    0.81%    0.1481%
2:          5.5998    1.14%    0.2035%
4:          5.5932    0.66%    0.1183%
Min:        5.4397    0.66%    0.1183%
Max:        5.5998    1.14%    0.2035%
Max−Min:    0.1601    0.48%    0.0852%


Table 2 Q_e + P_e : PQ_e measures

No.   Q_e      P_e      Q_e + P_e   PQ_e     Δ_e = |Q_e + P_e − PQ_e|   PQ_e/Δ_e
1     2.4537   2.9948   5.4485      5.4397   0.0088                     618
2     2.4638   3.1502   5.614       5.5998   0.0142                     394
4     2.46     3.1472   5.6072      5.5932   0.014                      400

Among the nine variant maps in 2DPQ, 1DP, and 1DQ, the three 2DPQ maps are 2D
distributions with different symmetric distributions. Maximal elements in the three maps
show stronger vertically oriented features. The three maps are symmetric in the
left/right direction and show a broken symmetry in the up/down direction. Pseudo-color
pixels on the three maps are shown in 3D shapes. The three 1DP maps have similar
bell-shaped distributions that illustrate Poissonian distributions. Compared with the
three 1DP maps, the three 1DQ maps have similar distributions with narrower bell shapes
that illustrate sub-Poissonian distributions.

However, the nine enlarged entropy curves of each type have significantly different
variations and distributions. Local curves are bounded in narrow regions with random
variations.

It is difficult to tell detailed differences from the entropy curves. The quantitative
measurements in Table 1 make a numeric comparison possible. The entropy variation
ratios of the three sets lie in Q_e^R: [0.26, 0.35]%, P_e^R: [0.19, 0.27]%,
and PQ_e^R: [0.12, 0.20]%. The three Max-Min values of {Q_e^R, P_e^R, PQ_e^R} are bounded
in [0.08, 0.09]%. The whole structure illustrates measurable stationary properties.
In Table 2, it is interesting to notice that Q_e + P_e ≈ PQ_e.

All variation measurements show distinct stationary randomness that can be
measured by entropy approaches.
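
The near equality of Q_e + P_e and PQ_e has a standard information-theoretic reading.
Assuming that the three entropies are computed from the same M pairs with the same
logarithm base, so that the 1DP and 1DQ histograms are the marginals of the 2DPQ
histogram, the difference Δ_e of Table 2 is the mutual information between the two
measures. The symbols H and I below are notation introduced here for illustration and
are not taken from the chapter.

```latex
\begin{aligned}
H(p) &= P_e, \qquad H(q) = Q_e, \qquad H(p, q) = PQ_e, \\
\Delta_e &= H(p) + H(q) - H(p, q) = I(p; q) \ge 0 .
\end{aligned}
```

Under this reading, Δ_e = 0 holds exactly when the two measures are statistically
independent, so the small values of Δ_e in Table 2 would suggest that p_i and q_i are
nearly independent within the sampled segments.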


6 Conclusion


Information entropy is a useful measurement for determining stationary randomness.
Three quantum random sequences are used, and distinct stationary randomness can be
identified from both variant maps and numeric measurements. To explore various
conditions of stationary properties, further investigations are required to establish
theoretical boundaries on variant maps.


Visual Maps of Variant Combinations
on Random Sequences


Jeffrey Zheng and Jie Wan 


Abstract Random sequences play a key role in network security applications.
Randomness testing schemes are very important for ensuring the randomness quality
of relevant sequences. This chapter proposes a visual scheme, based on variant
construction, that measures sequences to show intuitively some combinatorial properties
of the key stream generated by stream ciphers. Basic models are described. The scheme
provides a flexible framework in which the variant measure method is applied to the key
stream of stream ciphers to describe randomness in various combinatorial maps.


Keywords Visual scheme · Variant measure · Combinatorial projection ·
Random sequence


1 Introduction 


Random numbers play an important role in many network protocols and encryption
schemes in various network security applications [1], for example, visual cryptography,
digital signatures, authentication protocols, and stream ciphers. To determine whether
a random sequence is suitable for a cryptographic application, NIST has published
a series of statistical tests as standards.

In network security applications, stream ciphers play a key role: they have faster
throughput and are easier to implement than block ciphers [2]. RC4, the famous
stream cipher, is suitable for large packets in wireless LANs [3]. It has been used
for encrypting internet traffic in network protocols such as Secure Sockets Layer (SSL),
Transport Layer Security (TLS), Wi-Fi Protected Access (WPA), etc. [2].

The eSTREAM project collected stream ciphers from the international cryptology
community [4] to promote the design of efficient and compact stream ciphers suitable
for widespread adoption. After a series of tests, the algorithms submitted to eSTREAM
were selected into two profiles: one more suitable for software and the other more
suitable for hardware. Nonlinear structures and recursion play essential
roles in new developments.

Different visual schemes are required to test the randomness of random sequences from
different stream ciphers. Along this direction, this chapter proposes a flexible
framework to handle a set of meta measurements on different combinatorial projections.


2 Variant Combinatorial Visualization 


The architecture of variant visualization is shown in Fig. 1.
The variant visualization architecture is separated into four core components:
RGC, VSC, CC, and VC.


• RGC Randomness Generate Component generates a random sequence;

• VSC Variant Statistic Component handles the statistic process using the variant
measure method [5];

• CC Combinatorial Component chooses combinations;

• VC Visualization Component makes the visualization based on VSC measures and
CC vectors.


Fig. 1 Visualization architecture (diagram labels: Random Sequence, Statistic Vectors,
Combination Vectors, Visual results; VSC: Variant Statistic Component, CC: Combinatorial
Component, VC: Visualization Component)

The input n is the length of the binary sequence. The stream cipher can be
replaced by any stream cipher that generates a binary sequence. This section focuses
on the variant measure module and the visual method module.

A visual example of RC4 will be described in Sect. 2.5.


2.1 Variant Logic Framework 


The variant logic framework was proposed in [6]. Li [7] used the variant measure
method to generate different symmetry results [5] based on cellular automata
schemes [8]. Under such a construction, even some random sequences show symmetry
properties in their distributions.

Under variant construction, the variant conversion operator can be defined as
follows:

$$C(x, y) = \begin{cases} \bot, & x = 0,\ y = 0 \\ +, & x = 0,\ y = 1 \\ -, & x = 1,\ y = 0 \\ \top, & x = 1,\ y = 1 \end{cases} \tag{1}$$


It is convenient to list the relevant variant logic variables, as shown in Table 1.

In the variant measure method, each binary sequence is converted to probabilities:
the number of occurrences of each variable in {⊥, +, −, ⊤} is counted, and the
probability of each variable is computed from the counts. The measurement
method is shown in Table 1.

Table 1 The variant measure method


(a) Counting method
Variant variable    Number of type    Total number
⊥                   N_⊥               N = N_⊥ + N_⊤ + N_+ + N_−
⊤                   N_⊤
+                   N_+
−                   N_−

(b) Probability computing
Measure parameter   Formula
P_⊥                 N_⊥/N
P_⊤                 N_⊤/N
P_+                 N_+/N
P_−                 N_−/N



The variant measure method provides a set of measure results for different
0-1 sequences. The following mechanism transforms stream cipher sequences into the
relevant measures.

The essential models of the variant scheme are described as follows.


2.2 VSC Variant Statistic Component 


The VSC component converts the binary sequence to a variant sequence in the VCM
module and computes probabilities and entropies in the PECM module.
The component is shown in Fig. 2.


VCM Variant Conversion Module
The VCM module transforms input binary sequences by the following steps:


Step 1. Generate an n-bit binary sequence S = S_1 S_2 S_3 ... S_n by a stream cipher.

Step 2. Shift S to the left by M bits (M is the length of shifting) to generate a new
binary sequence S' = S'_1 S'_2 S'_3 ... S'_{n−M} = S_{1+M} S_{2+M} ... S_n.

Step 3. Convert the two sequences S and S' to a variant sequence V, V_i =
C(S_i, S'_i), i = 1, 2, 3, ..., (n − M).

Step 4. Separate V into n/N parts, where N is the length of each part and M < N < n, to
form a set of variant sequence groups

$$G = \{G_1, \ldots, G_{n/N}\} = \{\{V_1, V_2, \ldots, V_N\}, \ldots, \{V_{n-N}, V_{n-N+1}, \ldots, V_n\}\}$$

Step 5. Separate each item in G into N/M parts to establish a sequence group

$$G' = \{\{\{V_1, \ldots, V_M\}, \ldots, \{V_{N-M+1}, \ldots, V_N\}\}, \ldots, \{\{V_{n-N}, \ldots, V_{n-N+M}\}, \ldots, \{V_{n-M}, \ldots, V_n\}\}\}$$

[Figure: binary sequence → VCM → variant sequence group → PECM → entropy vectors]

VCM Variant Conversion Module
PECM Probability and Entropy Computing Module


Fig. 2 Variant statistic component
PECM Probability and Entropy Computing Module

PECM takes a variant sequence group, separated into several parts, and
computes probabilities and entropies. The equations for computing the parameters
are given in Table 1. The main steps are performed as follows:


Step 6. Compute the probability vector $P = \{P_\bot, P_+, P_-, P_\top\}$ of each part in G';

Step 7. Calculate the distribution probability vector $D = \{D_\bot, D_+, D_-, D_\top\}$ of each
part in G based on the P vectors;

Step 8. Evaluate the entropy vector $E = \{E_\bot, E_+, E_-, E_\top\}$ from the D vector.
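Steps 6 to 8 can be sketched as below. The sketch rests on an interpretation read off the worked example of Sect. 2.5 rather than on an explicit definition in the text: D_v is taken to be the relative frequency of each distinct value of P_v among the parts of one group, and E_v the natural-log entropy of D_v. The function names are hypothetical, and `L` and `T` stand in for ⊥ and ⊤.

```python
import math
from collections import Counter

SYMBOLS = ("L", "+", "-", "T")  # "L" is ⊥ and "T" is ⊤ in this sketch


def probability_vector(part: str):
    """Step 6: P_v = N_v / N for one part of a group."""
    return {v: part.count(v) / len(part) for v in SYMBOLS}


def distribution(values):
    """Step 7 (assumed reading): relative frequency of each distinct value."""
    counts = Counter(values)
    return {value: c / len(values) for value, c in counts.items()}


def entropy(dist):
    """Step 8: natural-log entropy of a distribution."""
    return sum(-d * math.log(d) for d in dist.values())


def pecm(parts):
    """Return (P list, D dict, E dict) for the parts of one group."""
    p = [probability_vector(part) for part in parts]
    d = {v: distribution([pv[v] for pv in p]) for v in SYMBOLS}
    e = {v: entropy(d[v]) for v in SYMBOLS}
    return p, d, e


p, d, e = pecm(["+T+-+LT+", "--TLT+-T"])
print(p[0])
print(d["T"], e["T"])
```

The natural logarithm is used because it reproduces the value 0.693147 (ln 2) quoted in the example for a two-valued uniform distribution.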


2.3 CC Combinatorial Component 


The CC component is separated into two modules: the SM module, which
performs the vector selection, and the VDM module, which performs the
visualization.

The visual data is a set of E vectors used as input for the visualization
component. For each E vector, a projection (a selection of its components) is
chosen as a visual vector to compute the visual result. Since each of the four
components is either selected or not, there are 2^4 = 16 visual results.

Based on the number of variables in a combination, the combination set can
be divided into 5 parts, i.e. the number of selected variables in the combination
ranges from 0 to 4.

Let the classification be $EC = \{EC_0, EC_1, EC_2, EC_3, EC_4\}$. Since $EC_0$ is
empty, it can be ignored. Only four distributions are of concern in Sect. 2.4.
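The count of 16 selections and their grouping into five classes can be checked with a short enumeration. The component labels below are placeholder strings chosen for this sketch.

```python
from itertools import combinations

# Placeholder labels for the four entropy components.
COMPONENTS = ("E_bot", "E_plus", "E_minus", "E_top")

# EC_k collects every selection that uses exactly k of the four components.
ec = {k: list(combinations(COMPONENTS, k)) for k in range(5)}

sizes = [len(ec[k]) for k in range(5)]
total = sum(sizes)
print(sizes, total)
```

The class sizes are the binomial coefficients 1, 4, 6, 4, 1; the single member of EC_0 is the empty selection, which carries no data and is therefore ignored.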


2.4 Visualization Component 


According to the variant measure method, in the rectangular axis, let Æ} be the 
positive axis of X, E+ be the negative axis of X, E, the positive axis of Y, E_ be the 
negative axis of Y. The axis is shown in Fig. 3. 

For $EC_1 = \{\{E_\bot\}, \{E_+\}, \{E_-\}, \{E_\top\}\}$, points are distributed on the axes.

For $EC_2 = \{\{E_\bot, E_+\}, \{E_\bot, E_-\}, \{E_\bot, E_\top\}, \{E_+, E_-\}, \{E_+, E_\top\}, \{E_-, E_\top\}\}$,
points are distributed in the shaded area in Fig. 4.

For $EC_3 = \{\{E_\bot, E_+, E_-\}, \{E_\bot, E_+, E_\top\}, \{E_\bot, E_-, E_\top\}, \{E_+, E_-, E_\top\}\}$, points
are distributed in the area of $EC_2$ and the area of $EC_1$.

For $EC_4 = \{\{E_\bot, E_+, E_-, E_\top\}\}$, points are distributed as shown in Fig. 5.
Fig. 3 Visualization axis


Fig. 4 Distribution areas of EC_2


Fig. 5 Distribution areas of EC_4


2.5 Example


An example is given step by step to show how the algorithm runs. In the example,
n, N and M are assigned the values 40, 16 and 8, respectively.


Step 1. Input a 40-bit binary sequence S,
{0101001011101011001011010111101101010101}.

Step 2. Generate S', {11101011001011010111101101010101}.

Step 3. Generate V, {+⊤+−+⊥⊤+−−⊤⊥⊤+−⊤⊥+⊤+⊤−+⊤⊥⊤−⊤−+−⊤}.

Step 4. Separate V into the G vector. The G vector is
{{+⊤+−+⊥⊤+−−⊤⊥⊤+−⊤}, {⊥+⊤+⊤−+⊤⊥⊤−⊤−+−⊤}}.

Step 5. Separate G into the G' vector. The G' vector in the example is
{{+⊤+−+⊥⊤+, −−⊤⊥⊤+−⊤}, {⊥+⊤+⊤−+⊤, ⊥⊤−⊤−+−⊤}}.

Step 6. Generate the probability vector P of each sequence in G'. The P vector of
{+⊤+−+⊥⊤+} is {P_⊥ = 0.125, P_+ = 0.5, P_− = 0.125, P_⊤ = 0.25}.

Step 7. Compute the distribution probability vector D of each sequence in G from P.
The D vector of {+⊤+−+⊥⊤+, −−⊤⊥⊤+−⊤} is shown in Fig. 6.

Step 8. Compute the entropy vector E of each sequence in G from D. The E vector
of {+⊤+−+⊥⊤+, −−⊤⊥⊤+−⊤} is shown in Fig. 7.


$D_\bot = \{P_{0.125} = 1\}$

$D_\top = \{P_{0.25} = 0.5,\ P_{0.375} = 0.5\}$


Fig. 6 D vectors of {+⊤+−+⊥⊤+, −−⊤⊥⊤+−⊤}


$E_\bot = -(P_{0.125} \log P_{0.125}) = 0.0$

$E_\top = -(P_{0.25} \log P_{0.25} + P_{0.375} \log P_{0.375}) = 0.693147$


Fig. 7 E vectors of {+⊤+−+⊥⊤+, −−⊤⊥⊤+−⊤}
Step 9. Compute visual results from the E vectors. For the E vectors of
{+⊤+−+⊥⊤+−−⊤⊥⊤+−⊤}: if the selection is $\{E_\bot\}$, points will
be (0.0, 0.0). If the selection is $\{E_\bot, E_\top\}$, points will be (0.0, −0.693147)
and (0.0, 0.0). If the selection is $\{E_\top, E_-\}$, points will be $\{E_- - |E_\top|\}$ =
(0.0, 0.0) and (0.0, 0.693147). If the selection is $\{E_\bot, E_\top, E_-\}$, points
will be $\{E_\bot, E_- - |E_\top|\}$ = (0.0, 0.0) and (0.0, 0.693147).

Step 10. Separate the visual results according to the EC classification. The visual
results of G in the example are shown in Fig. 8.


Fig. 8 Visual result of the example
3 Result


3.1 Visual Result of RC4


Initial parameters: {n : 128,000, N : 128, M : 16}. The visual result is shown in Fig. 9.

Initial parameters: {n : 128,000, N : 128, M : 24}. The visual result is shown in Fig. 10.

Initial parameters: {n : 128,000, N : 1000, M : 8}. The visual result is shown in Fig. 11.

Initial parameters: {n : 100,000, N : 100, M : 24}. The visual result is shown in Fig. 12.
Fig. 9 Visual result of RC4 {n : 128000, N : 128, M : 16}

Fig. 10 Visual result of RC4 {n : 128000, N : 128, M : 24}

Fig. 11 Visual result of RC4 {n : 128000, N : 1000, M : 8}

Fig. 12 Visual result of RC4 {n : 100000, N : 100, M : 24}


3.2 Visual Result of HC256


Initial parameters: {n : 128,000, N : 128, M : 16}. The visual result is shown in Fig. 13.

Initial parameters: {n : 128,000, N : 128, M : 24}. The visual result is shown in Fig. 14.

Initial parameters: {n : 100,000, N : 100, M : 8}. The visual result is shown in Fig. 15.

Initial parameters: {n : 100,000, N : 100, M : 16}. The visual result is shown in Fig. 16.


Fig. 13 Visual result of HC256 {n : 128000, N : 128, M : 16}

Fig. 14 Visual result of HC256 {n : 128000, N : 128, M : 24}

Fig. 15 Visual result of HC256 {n : 100000, N : 100, M : 8}

Fig. 16 Visual result of HC256 {n : 100000, N : 100, M : 16}

Panels (a) to (d) of each figure show the visual results of the individual EC classes.
4 Conclusion


The visual results show similar symmetry properties in the sequences generated by
RC4 and HC256. The sequences exhibit distinctive distributions and can be clearly
distinguished by their combinatorial maps. From our models and illustrations,
various maps can be integrated through their combinatorial projections to show
different spatial distributions of random sequences. Under this configuration, the
variant measure method provides a new analysis tool for further exploration of
stream cipher applications.
Part VIII
Applications—DNA Sequences 


Random numbers should not be generated with a method chosen at 
random. 


—Donald Knuth 


Natural selection is anything but random. 
—Richard Dawkins 


Biology is the most powerful technology ever created. 
DNA is software, proteins are hardware, cells are factories. 


—Arvind Gupta 


Initial approaches of variant construction on DNA sequences have been developed
since 2012. Examples include: Randomness Measurement of Pseudorandom Sequence
Using Different Generation Mechanisms and DNA Sequence, Journal of Chengdu
University of Information Technology 27(6): 548-555, 2012; 2D Conjugate Maps
of DNA Sequences, Journal of Information Security Vol. 4 No. 4 (2013),
https://doi.org/10.4236/jis.2013.44021; Pseudo DNA Sequence Generation of Non
Coding Distributions Using Variant Maps on Cellular Automata, Applied
Mathematics 5: 153-174, 2014; Variant Map Construction to Detect Symmetric
Properties of Genomes on 2D Distributions, J Data Mining Genomics Proteomics
5:150, 2014; Variant Maps to Identify Coding and Non-coding DNA Sequences of
Genomes Selected from Multiple Species, Biol Syst Open Access 2016, 5:1,
https://doi.org/10.4172/2329-6577.1000153; and Mapping Whole DNA Sequence on
Variant Maps, ASONAM 2017: 1037-1040, https://doi.org/10.1145/3110025.3110140.

This direction contains extensive results across various applications.

This part on DNA sequences is composed of two chapters (23 and 24).

Chapter “Variant Map System to Simulate Complex Properties of DNA
Interactions Using Binary Sequences” describes the use of binary sequences to
simulate DNA interactions under four meta bases. Different stream ciphers and
real DNA sequences are compared. Their maps illustrate the similarities and
differences among the selected sequences.

Chapter “Whole DNA Sequences of Cebus capucinus on Variant Maps” applies
whole DNA sequences of Cebus capucinus (white-faced capuchin monkey) to variant
maps. This set of maps shows various distributions with complex characteristics.
Further research is required.


Variant Map System to Simulate
Complex Properties of DNA Interactions
Using Binary Sequences


Jeffrey Zheng, Weiqiong Zhang, Jin Luo, Wei Zhou and Ruoyu Shen


Abstract Stream ciphers, DNA cryptography and DNA analysis are among the most
important R&D fields in cryptography and bioinformatics. HC-256 is an emerging
scheme in the new generation of stream ciphers for advanced network security. From
a random sequencing viewpoint, both HC-256 sequences and real DNA data may
have intrinsic pseudo-random properties. In the recent decade, many DNA
sequencing projects on cells, plants and animals around the world have produced
huge DNA databases. Researchers have noticed that mammalian genomes encode
thousands of large noncoding RNAs (lncRNAs), which interact with chromatin
regulatory complexes and are thought to play a role in localizing these complexes
to target loci across the genome. It is a challenging target to use higher
dimensional visualization tools to organize various complex interactive properties
as visual maps. The Variant Map System (VMS) is systematically proposed in this
chapter as an emerging scheme that applies multiple maps using four Meta symbols,
the same number as in DNA or RNA representations. The system architecture of key
components and the core mechanism of the VMS are described. Key modules,
equations and their I/O parameters are discussed. Applying the VMS, two sets of
real DNA sequences from sample human (noncoding DNA) and corn (coding DNA)
genomes are collected and compared with pseudo DNA sequences generated by HC-256
to show their intrinsic properties in higher levels of similar relationships among
relevant DNA sequences on 2D maps. Sample 2D maps are listed and their
characteristics are illustrated under a controllable environment. Visual results
are briefly analyzed to explore the intrinsic properties of the selected genome
sequences.
1 Introduction


Stream ciphers [1, 2] play a key role in modern network security [3, 4], especially
in multimedia network environments; their core component, the pseudo-random
number generation mechanism [5-7], takes the central position in modern
cryptography [8, 9]. Along with the development of bioinformatics, advanced DNA
sequencing and analysis techniques [10, 11] have progressed significantly over the
past decade.


1.1 DNA Cryptography 


DNA cryptography is a joint research field of DNA computing and cryptography.
Scholars around the world have focused on this field and published different
results, such as simulating DNA evolution [12], DNA pseudorandom number
generators [13-16], DNA cryptography [9, 17, 18] and so on. However, DNA
cryptography is currently still at an early stage as an emerging area of advanced
cryptography.

In typical results of DNA cryptography on encryption, different coding schemes
can be selected arbitrarily. E.g. the algorithm in paper [17] applies an encoding
formula to express the plaintext as a DNA sequence: {00 → C, 01 → T, 10 → A, 11 → G};
however, in paper [18] the same author uses the coding formula {00 → A, 01 → T,
10 → C, 11 → G} for the plaintext on a DNA sequence. In an encryption environment,
all 4! = 24 possible encoding methods could be used equally in different applications.
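The count of 24 follows from assigning the four bases to the four 2-bit patterns in every possible order. A small sketch (the helper `encode` and the variable names are assumptions for illustration) enumerates them and applies the two coding formulas quoted above:

```python
from itertools import permutations

PAIRS = ("00", "01", "10", "11")

# Every bijection from 2-bit patterns to the four bases.
encodings = [dict(zip(PAIRS, perm)) for perm in permutations("ACGT")]


def encode(bits: str, table: dict) -> str:
    """Encode a bit string of even length, two bits per base."""
    return "".join(table[bits[i:i + 2]] for i in range(0, len(bits), 2))


scheme_17 = {"00": "C", "01": "T", "10": "A", "11": "G"}  # formula quoted from [17]
scheme_18 = {"00": "A", "01": "T", "10": "C", "11": "G"}  # formula quoted from [18]

print(len(encodings))
print(encode("00011011", scheme_17), encode("00011011", scheme_18))
```

Both quoted formulas appear among the 24 enumerated tables, which differ only in which base is attached to which bit pair.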


1.2 Stream Cipher HC-256 


Stream ciphers are an important class of encryption algorithms. A stream cipher 
is a symmetric cipher which operates with a time-varying transformation on indi- 
vidual plaintext digits. The ECRYPT Stream Cipher Project (eSTREAM) [1] was 
a multi-year effort, running from 2004 to 2008, to promote the design of efficient 
and compact stream ciphers suitable for widespread adoption. HC-256 is a stream
cipher designed to provide bulk encryption in software at high speeds while
permitting strong confidence in its security. A 128-bit variant was submitted in
2004 as an eSTREAM cipher candidate; in 2008 it was selected as one of the four
final contestants in the software profile [2, 4], as one of the most advanced
schemes for stream cipher applications in advanced network environments.


1.3 Large Noncoding DNA and RNA 


In relation to DNA analysis, visualization methods play a key role in the Human
Genome Project (HGP) [19]. After the HGP was completed successfully, a public
research consortium, the Encyclopedia of DNA Elements (ENCODE), was launched by
the National Human Genome Research Institute (NHGRI) in 2003 to find all functional
elements in the human genome, as one of the most critical NHGRI projects to
explore genomes after the HGP.

In 2012, ENCODE released a coordinated set of 30 papers published in the
journals Nature, Genome Biology and Genome Research. These publications show
that approximately 20% of noncoding DNA in the human genome is functional, while
an additional 60% is transcribed with no known function [20]. Much of this functional
noncoding DNA is involved in regulating the expression of coding genes [10].
Furthermore, the expression of each coding gene is controlled by multiple regulatory
sites located both near and distant from the gene. These results demonstrate that
gene regulation is far more complex than was previously believed [11]. Mammalian
genomes encode thousands of large noncoding RNAs (lncRNAs), many of which
regulate gene expression, interact with chromatin regulatory complexes, and are
thought to play a role in localizing these complexes to target loci across the genome
[21]. In association with different international projects, large numbers of genome
databases have been established and massive genome-wide gene expression
measurements have been developed.

Due to the huge number of DNA sample collections and the extreme difficulty of
determining their variation properties in wider applications [19, 22-27], it is
essential to extend advanced DNA analysis models, methods and tools, and to
explore emerging models and concepts that interpret complex interactions
among complicated sets of DNA sequences in real environments.


1.4 DNA Analysis 


DNA analysis plays a key role in modern genomic applications [19]. The HGP is
closely tied to advanced DNA sequencing and analysis techniques. DNA sequences
are composed of four Meta symbols {A, T, G, C} as their basic structure. The
classical DNA double helix structure forms the first level of pair construction of
DNA sequences, with the complementary A & T and G & C structures as the first
level of symmetric relationships. A typical DNA sequencing result is shown in
Fig. 1a. The four Meta symbols can be separated into four projective sequences.

In ENCODE, recent genomic analysis results indicate that encoded sequences
account for only about 20% of the human genome, while around 80% of the genome
looks like useless sequences. Under further assumptions, it seems that additional
symmetric properties are required to satisfy the second, third and higher levels of
structural constructions to explore complex interactive properties [10, 11, 19-29].

In the current situation, it is necessary for researchers to shift targets in
computational cell biology from directly collecting sequential data to making
higher-level interpretations and exploring efficient content-based retrieval
mechanisms for genomes. Using higher dimensional visualization tools, their
complex interactive properties could be organized systematically as different
visual maps.


1.5 Variant Construction and DNA 


Variant construction is a new structure composed of logic, measurement and
visualization models to analyze 0-1 sequences under variant conditions. Further
details of this construction can be found in work on variant logic [30, 31], 2D maps
[32, 33], the variant pseudo-random number generator [34], DNA maps [35] and
variant phase spaces [33]. Since the variant system uses another set of four Meta
symbols {⊥, +, −, ⊤} to describe the system, the typical correspondence shown in
Fig. 1b may provide a natural mapping between DNA and variant data sequences.

Since DNA sequences play an essential role in exploring different symmetric
properties based on analysis approaches, in this chapter measurement and visual
models are proposed systematically that use a fixed segment structure to measure
the distributions of the four Meta symbols in their spectrum construction. Under
this construction, refined symmetric features can be identified from various
polarized distributions and further symmetric properties are visualized.
1.6 Target of This Chapter


The target of this chapter is to establish the Variant Map System (VMS) as a
unified framework to analyze complex DNA interactions on both artificial and
natural DNA sequences. The VMS is designed to use variant logic schemes [30-35],
applying multiple maps on four Meta symbols as DNA or RNA representations. The
system architecture of key components and the core mechanism of the VMS are
described. Key modules, equations and their I/O parameters are discussed. Applying
the VMS, two sets of real DNA sequences from human (noncoding DNA) and corn
(coding DNA) genomes are collected and compared with pseudo DNA sequences
generated artificially by HC-256 to show their intrinsic properties in higher levels
of similar relationships among DNA sequences on 2D maps. Further descriptions and
discussions are provided.


2 System Architecture 


In this section, system architecture and their core components are discussed with the 
use of diagrams. The refined definitions and equations of this system are described 
in the next section—Variant Map System. 


2.1 Architecture 


The four components of a variant map system are the Binary To DNA (BTD), the 
Binary Probability Measurement (BPM), the Mapping Position (MP), and the Visual 
Map (VM) as shown in Fig. 2. 

The architecture is shown in Fig. 2a with the key modules of the four core com- 
ponents being shown in Fig. 2b—e respectively. 

In the first part of the system, the t-th sequence $Y^t$ on either {0, 1} or {A, G, T,
C} is the input data of the BTD module. The main function of the BTD is to
output a unified sequence $X^t$, either by transforming a 0-1 sequence or by keeping
a DNA sequence, as a pseudo or pure DNA sequence under a set of controlled
parameters.

Using this unified DNA sequence, four vectors of probability measurements are
created from the t-th selected DNA sequence with $N_t$ elements as input. Multiple
segments are partitioned with a fixed number of n elements for each segment; at least
$m_t$ segments can be identified by the BPM component. The next component uses the
four vectors of probability measurements and a given k value as input data, and a
pair of position values is created for each Meta symbol. Four pairs of values are
generated by the MP component. Then, in order to process multiple selected DNA
sequences, all selected sequences are processed by the VM component, and each
sequence provides a set of paired values to generate the relevant variant maps that
indicate their distribution properties.

[Figure: (d) the MP Mapping Position module is composed of three components: HIS, NH and PP; (e) the VM module is itself: VM]

Fig. 2 Variant Map System VMS and key components. a Architecture; b BTD component; c BPM
component; d MP component; e VM component

With eight parameters in the input group, there are three sets of parameters in the
intermediate group and one set of parameters in the output group.
The three groups of parameters are listed as follows.


Input Group:
t      An integer indicating the t-th DNA sequence selected, $0 \le t < T$
r      An integer indicating the relationship distance among elements in a binary
       sequence, $r \ge 1$
mode   An integer indicating the mode of elements in a sequence, mode ∈ {0, 1, ...};
       mode = 0 for a DNA sequence, mode = 1 for a binary sequence
$N_t$    An integer indicating the number of elements in the t-th DNA sequence, $N_t \gg r$
$Y^t$    An input data vector with $N_t$ elements, $Y^t \in \{D^{N_t} \mid mode = 0;\ B^{N_t} \mid mode = 1\}$
n      An integer indicating the number of elements in a segment, n > 0
V      A symbol selected from the four DNA symbols {A, G, T, C} = D, V ∈ D
k      An integer indicating the control parameter for mapping, k > 0


Intermediate Group:
$X^t$                  A unified DNA vector with $N_t$ elements, $X^t \in D^{N_t}$
$\{\rho_l^V\}$             Four sets of probability measurements with $0 \le l < m_t$, V ∈ D
$\{(x_V^k, y_V^k)\}$     Four paired values, k > 0, V ∈ D


Output Group:
$\{Map_V\}$   Four 2D maps, V ∈ D

2.2 BTD Binary to DNA 


The BTD component shown in Fig. 2b is composed of one module: BTD itself. Five
parameters are input signals and one unified vector is generated by the
BTD component as the output group.


Input Group:
t      An integer indicating the t-th DNA sequence selected, $0 \le t < T$
r      An integer indicating the relationship distance among elements in a binary
       sequence, $r \ge 1$
mode   An integer indicating the mode of elements in a sequence, mode ∈ {0, 1, ...};
       mode = 0 for a DNA sequence, mode = 1 for a binary sequence
$N_t$    An integer indicating the number of elements in the t-th DNA sequence, $N_t \gg r$
$Y^t$    An input data vector with $N_t$ elements, $Y^t \in \{D^{N_t} \mid mode = 0;\ B^{N_t} \mid mode = 1\}$


Output Group:
$X^t$    A unified data vector with $N_t$ elements, $X^t \in D^{N_t}$

The BTD component takes an input vector in either binary or DNA format and
transforms it under a set of input parameters. The output of the
BTD component is a unified vector in DNA format under the given condition.


2.3 BPM Binary Probability Measurement 


The BPM component shown in Fig. 2c is composed of two modules: BM Binary
Measure and PM Probability Measurement. Three parameters are listed as input
signals; four vectors of binary measures are output by the BM module as
an intermediate group and four sets of probability measurements are output as the
output group.


Input Group:
n      An integer indicating the number of elements in a segment, n > 0
V      A symbol selected from the four DNA symbols {A, G, T, C} = D, V ∈ D
$X^t$    A DNA vector with $N_t$ elements, $X^t \in D^{N_t}$


Intermediate Group:
$\{M_V\}$    Four 0-1 vectors with $N_t$ elements, $M_V(I) \in \{0, 1\} = B$, $M_V \in B^{N_t}$, V ∈ D


Output Group:
$\{\rho_l^V\}$   Four sets of probability measurements with $0 \le l < m_t$, V ∈ D


The BPM component transforms a selected DNA sequence into four 0-1
vectors in the BM module. Four probability vectors are then generated by the
PM module as the output of the BPM under a fixed segment length.


2.4 MP Mapping Position 


The MP component shown in Fig. 2d is composed of three modules: HIS Histogram,
NH Normalized Histogram and PP Pair Position. Two parameters are listed as input
signals; four histograms and four normalized histograms are generated by the HIS
module and the NH module as intermediate groups, respectively. Four paired
values are generated by the PP module as the output group.


Input Group:
$\{\rho_l^V\}$   Four sets of probability measurements with $0 \le l < m_t$, V ∈ D
k          An integer indicating the control parameter for mapping, k > 0


Intermediate Group:
$\{H(\rho^V)\}$      Four histograms for the relevant probability measurements, V ∈ D
$\{P_H(\rho^V)\}$    Four normalized histograms for the relevant probability measurements, V ∈ D


Output Group:
$\{(x_V^k, y_V^k)\}$   Four paired values, k > 0, V ∈ D


The MP component takes probability measurements as input and, under a given k
condition, generates each relevant histogram and its normalized distribution. The
output of the MP component is composed of four paired values controlled by the
given condition.


2.5 VM Visual Map 


The VM component shown in Fig. 2e is composed of one module: VM Visual Map.
Three parameters are input signals. Collecting all selected DNA sequences, four 2D
maps are generated by the VM component as the output result.


Input Group:
$\forall t$                   All DNA sequences are selected, $0 \le t < T$
$Y^t$                   An input data vector with $N_t$ elements, $Y^t \in \{D^{N_t} \mid mode = 0;\ B^{N_t} \mid mode = 1\}$
$\{(x_V^k, y_V^k)^t\}$    Four paired values for the t-th DNA sequence, k > 0, V ∈ D


Output Group:
$\{Map_V\}$   Four 2D maps, V ∈ D


The VM component processes all selected DNA sequences as input to generate
paired values for each sequence. The output of the VM component is composed of
four 2D maps that show the final visual distribution for the system.


3 Variant Map System 


In this section, definitions and equations are provided to describe the VMS. In
addition to the initial preparation, seven core modules are involved: the BTD, BM,
PM, HIS, NH, PP and VM modules.


3.1 Initial Preparation


Let r be an input parameter that pairs each element of a binary sequence with the
element at distance r to form a pseudo DNA vector, and let mode be a control
parameter that indicates the pair operations to be performed if mode ≥ 1. Denote
B = {0, 1} as the binary base and D = {A, G, T, C} as the DNA base.


3.2 BTD Module 


Let Y be an input sequence with N elements, $0 \le I < N$, where $Y(I) \in B$ if
mode ≥ 1 and $Y(I) \in D$ if mode = 0. This input vector can be expressed as follows.

$$Y = (Y(0), \ldots, Y(I), \ldots, Y(N-1)),\quad 0 \le I < N,\quad Y \in \{B^N \mid mode \ge 1;\ D^N \mid mode = 0\} \tag{1}$$

Let X denote a DNA sequence with N elements and D denote a symbol set with four
elements, i.e. D = {A, G, T, C}. This type of DNA sequence can be described by
a four valued vector as follows:

$$X = (X(0), \ldots, X(I), \ldots, X(N-1)),\quad 0 \le I < N,\quad X(I) \in D = \{A, G, T, C\},\quad X \in D^N \tag{2}$$

From this input and the associated parameters, the following operations are performed.
If mode = 0, for all I, $Y(I) \in D$ and the output vector is equal to the input vector.

$$\forall I,\quad X(I) = Y(I),\quad 0 \le I < N \tag{3}$$

If mode = 1, for all pairs of elements I and I + r (mod N) of Y, $Y(I), Y(I+r) \in B$,
the I-th output element X(I) is determined by the corresponding conditions
shown in Fig. 1b as follows.

$$X(I) = \begin{cases}
G, & \text{if } Y(I) = 0\ \&\ Y(I+r) = 0 \\
A, & \text{if } Y(I) = 0\ \&\ Y(I+r) = 1 \\
T, & \text{if } Y(I) = 1\ \&\ Y(I+r) = 0 \\
C, & \text{if } Y(I) = 1\ \&\ Y(I+r) = 1
\end{cases} \tag{4}$$

In both conditions, X will be a unified vector with four values as the output of the
BTD shown in Fig. 2b.


E.g. let a binary sequence Y = 100111001011, N = 12; three pseudo DNA
sequences (r = 1, r = 2, r = 3) can be represented as follows.


Y = 100111001011
X_{r=1} = TGACCTGATACC
X_{r=2} = TAACTTAGCACT
X_{r=3} = CAATTCGACATT
$Y \in B^{12}$, $X \in D^{12}$


By selecting a certain r value, a relevant pseudo DNA sequence can be generated
from an input binary sequence.
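The BTD transformation of Eqs. (3) and (4) can be sketched as follows; the function name `btd` and its string-based interface are assumptions made for this illustration. Pairs are taken cyclically, i.e. position I is paired with position (I + r) mod N.

```python
# Pair-to-base table of Eq. (4): (Y(I), Y(I + r)) -> X(I).
PAIR_TO_BASE = {(0, 0): "G", (0, 1): "A", (1, 0): "T", (1, 1): "C"}


def btd(y: str, r: int = 1, mode: int = 1) -> str:
    """Return the unified DNA string X for an input string Y."""
    if mode == 0:
        return y  # Eq. (3): a DNA input is passed through unchanged
    bits = [int(b) for b in y]
    n = len(bits)
    return "".join(PAIR_TO_BASE[(bits[i], bits[(i + r) % n])] for i in range(n))


Y = "100111001011"
for r in (1, 2, 3):
    print(r, btd(Y, r))
```

Running the sketch on Y = 100111001011 reproduces the three pseudo DNA sequences listed above for r = 1, 2 and 3.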


3.3 BM Module 


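For a given I-th element, four projective operators can be defined and denoted as
$\{M_A(I), M_G(I), M_T(I), M_C(I)\}$.

$$
M_A(I) = \begin{cases} 1, & \text{if } X(I) = A \\ 0, & \text{otherwise} \end{cases}
\qquad
M_G(I) = \begin{cases} 1, & \text{if } X(I) = G \\ 0, & \text{otherwise} \end{cases}
$$
$$
M_T(I) = \begin{cases} 1, & \text{if } X(I) = T \\ 0, & \text{otherwise} \end{cases}
\qquad
M_C(I) = \begin{cases} 1, & \text{if } X(I) = C \\ 0, & \text{otherwise} \end{cases}
\tag{5}$$

Applying the four operators to all elements, the DNA sequence X can be
reorganized into four binary sequences of 0-1 values, i.e.

$$M_V : \{X(I)\}_{I=0}^{N-1} \to \{(M_A(I), M_G(I), M_T(I), M_C(I))\}_{I=0}^{N-1},\quad M_V(I) \in B = \{0, 1\},\ V \in D \tag{6}$$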


E.g. Let a DNA sequence X = CTGATTAGCCAT, N = 12, its four binary 
sequences can be represented as follows. 


X = CTGATTAGCCAT 
Ma = 000100100010 
Mg = 001000010000 
Mr = 010011000001 
Mc = 100000001100 


It is interesting to notice that the basic relationship between a DNA sequence X and
its four $M_V$ sequences is exactly the same as in a modern DNA sequencing procedure
that separates a selected DNA sequence into the four Meta symbol sequences shown in
Fig. 1a. This correspondence could be the key feature for applying the proposed
scheme naturally in simulating complex behaviors of any DNA sequence.

The projection $M_V$ provides the essential operation in the BM component as the
first module shown in Fig. 2c.
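A minimal sketch of the projection of Eqs. (5) and (6) is given below; the function name `bm` and the use of bit strings for the four 0-1 vectors are assumptions of the sketch.

```python
def bm(x: str) -> dict:
    """Project a DNA string onto four 0-1 strings, one per Meta symbol."""
    return {v: "".join("1" if c == v else "0" for c in x) for v in "AGTC"}


projections = bm("CTGATTAGCCAT")
for symbol, bits in projections.items():
    print(symbol, bits)
```

At every position exactly one of the four projections is 1, so the four sequences together carry the same information as X.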
3.4 PM Module


For this set of four binary sequences, it is convenient to partition them into m
segments, each containing a fixed number of n elements.

For the l-th segment, let $0 \le l < m$, $0 \le j < n$; the I-th position is
$I = l \times n + j$, and four probability measurements $\{\rho_l^A, \rho_l^G, \rho_l^T, \rho_l^C\}$ can be defined.

$$\rho_l^V = \frac{1}{n} \sum_{I = l \times n}^{(l+1) \times n - 1} M_V(I),\quad 0 \le I < N = n \times m \tag{7}$$

Under this construction, four sets of probability measurements are established.

$$\rho^V : \{M_A(I), M_G(I), M_T(I), M_C(I)\}_{I=0}^{N-1} \to \{\rho_l^A, \rho_l^G, \rho_l^T, \rho_l^C\}_{l=0}^{m-1} \tag{8}$$

The probability operator $\rho^V$ generates four probability measurement vectors in
the PM component as the second module shown in Fig. 2c. After the BM and PM
processes, the whole procedure of the BPM component in Fig. 2c is complete.
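The segment probabilities of Eq. (7) can be sketched as follows. The function name `pm` is hypothetical, and the segment length n = 4 is chosen only to keep the illustration small; the 0-1 strings are the projections of X = CTGATTAGCCAT from the BM example.

```python
def pm(m_v: str, n: int) -> list:
    """Return [rho_0, ..., rho_{m-1}]: the fraction of ones in each n-element segment."""
    m = len(m_v) // n  # number of complete segments
    return [m_v[l * n:(l + 1) * n].count("1") / n for l in range(m)]


rho_a = pm("000100100010", n=4)  # projection M_A
rho_t = pm("010011000001", n=4)  # projection M_T
print(rho_a, rho_t)
```

Each value is a multiple of 1/n, which is why at most n + 1 distinct values can occur in a probability vector.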


3.5 HIS Module 


Since the BPM generates four sets of probability measurements, it is necessary to
perform further operations in the MP component shown in Fig. 2d as follows.

In the HIS component, the first module in Fig. 2d, each probability sequence
$\{\rho_l^V\}_{l=0}^{m-1}$, V ∈ D, is calculated from n positions, so at most n + 1 distinct
values can be identified in a vector. Under this organization, a histogram
distribution can be established.


Let H(.) be a histogram operator; for each position, it satisfies the following relation,

$$H(\rho_l^V) = \begin{cases} 1, & \text{if } \rho_l^V = i/n \\ 0, & \text{otherwise} \end{cases},\quad V \in D,\ 0 \le i \le n \tag{9}$$

Collecting all possible values, a histogram distribution can be established,

$$H(\rho^V) = \sum_{l=0}^{m-1} H(\rho_l^V) \tag{10}$$

The histogram $H(\rho^V)$ is the output of the HIS module. Four histograms are
generated by the HIS process. Further normalization is performed in the
NH component, the second module in Fig. 2d.
3.6 NH Module


Under this construction, a normalized histogram can be defined as

$$P_H(\rho^V) = H(\rho^V)/m \tag{11}$$

After the NH component has finished, its output is passed to the PP component,
the third module in Fig. 2d, for further operations.
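The HIS and NH modules can be sketched together. The function names `histogram` and `normalized` are assumptions of this sketch, and the input values are the illustrative segment probabilities [0.25, 0.5, 0.25] with n = 4 rather than data from the chapter.

```python
def histogram(rho: list, n: int) -> list:
    """H: count how many segment probabilities equal i/n, for i = 0..n."""
    h = [0] * (n + 1)
    for value in rho:
        h[round(value * n)] += 1  # value is a multiple of 1/n, so value * n is an index
    return h


def normalized(h: list) -> list:
    """P_H = H / m, where m is the number of segments."""
    m = sum(h)
    return [count / m for count in h]


h = histogram([0.25, 0.5, 0.25], n=4)
p_h = normalized(h)
print(h, p_h)
```

The normalized histogram has n + 1 entries that sum to 1, in line with the normalization condition stated for the PP module.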


3.7 PP Module 


The relevant probability vectors have (n + 1) distinct values; the four sets of
normalized vectors can be organized in a linear order as follows,

$$p_i^V = \sum_{l=0}^{m-1} H(\rho_l^V \mid \rho_l^V = i/n)/m,\quad 0 \le i \le n \tag{12}$$

Under this condition, four linear sets of probability vectors are established,

$$P_H(\rho^V) = \{p_i^A, p_i^G, p_i^T, p_i^C\}_{i=0}^{n},\quad p_i^V \in [0, 1],\ V \in D,\ 0 \le i \le n \tag{13}$$

For the four vectors, the components are normalized respectively,

$$\sum_{i=0}^{n} p_i^V = 1,\quad V \in D \tag{14}$$

Each of the four sets of probability vectors forms a complete partition of its
measurements.

Using this set of measurements, two mapping functions can be established to
calculate a pair of values that maps the analyzed DNA sequence into a 2D map as follows.

Let y = F(P, V, k) and x = F(P, V, 1/k), or $(x_V^k, y_V^k)$, be a pair of values defined
by the following equations,

$$y_V^k = F(P, V, k) = \left(\sum_{i=0}^{n} (p_i^V)^k\right)^{1/k},\qquad
x_V^k = F(P, V, 1/k) = \left(\sum_{i=0}^{n} (p_i^V)^{1/k}\right)^{k},\quad V \in D \tag{15}$$

In the PP component, four paired values are generated, and each pair indicates a specific position on a 2D map for the selected DNA sequence. The core operations of the three key components (BTD, BPM and MP) for a selected sequence are shown in Fig. 2b-d.
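As a hedged illustration of the PP step, the sketch below computes one (x, y) pair from a probability vector. It assumes that F(P, V, e) is the plain power sum of the vector components with exponent e; the helper name `map_position` is hypothetical and not part of the VMS.

```python
def map_position(p, k):
    """Return (x, y) for one probability vector p = [p_0, ..., p_n].

    Assumption: F(P, V, e) is the power sum of the components with
    exponent e, so y uses exponent k and x uses exponent 1/k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    y = sum(v ** k for v in p)
    x = sum(v ** (1.0 / k) for v in p)
    return x, y


# A vector concentrated on one value maps to (1, 1); spreading the
# probability over more values raises x and lowers y when k > 1.
print(map_position([1.0, 0.0, 0.0], k=2))
print(map_position([0.0, 0.5, 0.5], k=2))
```

Under this assumption a larger k pushes y toward small values for spread-out vectors, which is one way to read the stronger separation reported for k = 7.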


3.8 VM Module 


Since only one point of a 2D map is determined for a selected DNA sequence, it is essential to use a relatively large number of DNA sequences as inputs to generate visible distributions. This type of operation is performed in the VM component shown in Fig. 2e.

In the general case, the VM component processes a selected data set $\{Y^t\}_{t=0}^{T-1}$ composed of T sequences. The t-th sequence, with $N_t$ elements, can be expressed as $Y^t = (Y^t(0), \ldots, Y^t(I), \ldots, Y^t(N_t - 1))$, where $Y^t \in B^{N_t}$ for mode ≥ 1 and $Y^t \in D^{N_t}$ for mode = 0. Each sequence is processed by the same procedures of the BTD, BPM and MP components. Since the segment length n is fixed for all selected sequences, the number of segments is set to $m^t = \lfloor N_t / n \rfloor$ by convention to match each sequence. Under this expression, the last module, VM, collects all T pairs of positions on the relevant 2D visual maps as follows,

$$VM: \{Y^t\}_{t=0}^{T-1} \rightarrow \{(x_V^k, y_V^k)^t\}_{t=0}^{T-1} \rightarrow Map_V, \quad V \in D \qquad (16)$$


A sample 2D map of VM is shown in Fig. 3; it illustrates this type of visual map for the case of multiple sequences.

Under this construction, a total of T DNA sequences are transformed into T visual points on each of four 2D visual maps, which helps analysts explore the intrinsic symmetry properties among the four binary sequences.
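The collection step of the VM component can be sketched as follows. The code is a minimal illustration, not the VMS implementation: it assumes that a segment is measured by counting one symbol, that the two mapping functions are power sums with exponents 1/k and k, and that a trailing partial segment is dropped; `sequence_point` and `variant_maps` are hypothetical names.

```python
def sequence_point(seq, symbol, n, k):
    """Map one sequence to a single (x, y) point for one symbol."""
    m = len(seq) // n  # floor(N_t / n); a trailing partial segment is dropped
    if m == 0:
        raise ValueError("sequence is shorter than one segment")
    counts = [seq[l * n:(l + 1) * n].count(symbol) for l in range(m)]
    p = [counts.count(i) / m for i in range(n + 1)]  # normalized histogram
    # Assumed power-sum form of the two mapping functions.
    x = sum(v ** (1.0 / k) for v in p)
    y = sum(v ** k for v in p)
    return x, y


def variant_maps(sequences, n, k, alphabet="ATGC"):
    """Collect one point per sequence on each of the four maps."""
    return {v: [sequence_point(s, v, n, k) for s in sequences] for v in alphabet}


maps = variant_maps(["AACGTTAGACGA", "GGGGCCCCTTTT"], n=4, k=2)
for symbol, points in maps.items():
    print(symbol, points)
```

With T input sequences, each of the four maps receives exactly T points, so a visible distribution only emerges when T is large, as with the T = 2700 sequences used in the sample results.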


4 Sample Results on 2D Maps


Two types of data sets are selected for comparison. The first type consists of real DNA sequences collected from both human and plant genomes to illustrate their differences on 2D maps. The second type is generated by the stream cipher HC-256 as a pseudo-random binary sequence under a certain condition.


4.1 DNA Data Resources 


It is important to use real DNA sequences to illustrate various test results of the VMS. Two sets of DNA sequences are selected, and their relevant features are described as follows.

The first data set originally comes from the human genome assembly version 37 and was taken from the reference sequences of 13 anonymous volunteers from Buffalo, New York. Hi-C technology [5] was used to analyze the role of chromatin interactions in the genome. From a genomic analysis viewpoint, this set of data may contain more complex secondary or higher-level structures. A special structure near the GRCh37 DNA sequence has been identified to explore its spatial characteristics. After positive and negative sequencing, each data file contains 2700 DNA sequences, and each sequence has around 500 elements; the data are stored in two files, left and right, respectively.

The second DNA data set is selected from a plant gene database for comparison. One set of DNA sequences from corn genomes is stored in the file 201-500, which contains 2700 DNA sequences, each with around 200-600 elements. These may be ordinary single sequences without complex secondary structures.


4.2 Pseudo DNA Data Resources 


The stream cipher HC-256 was used to generate a binary sequence with a total length of 2700 × 500 bits in the file hc256, which was partitioned into 2700 subsequences of 500 bits each.
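The partitioning itself is a simple slicing operation. In the sketch below, Python's `random` module is used only as a stand-in bit source so that the example runs without extra dependencies; it is not HC-256, and `partition_bits` is a hypothetical helper name.

```python
import random


def partition_bits(bits, length):
    """Split a bit list into consecutive subsequences of equal length."""
    count = len(bits) // length  # an incomplete tail would be dropped
    return [bits[i * length:(i + 1) * length] for i in range(count)]


rng = random.Random(0)  # stand-in generator, NOT the HC-256 key stream
stream = [rng.getrandbits(1) for _ in range(2700 * 500)]
subsequences = partition_bits(stream, 500)
print(len(subsequences), len(subsequences[0]))
```

Replacing the stand-in generator with an actual HC-256 key stream would leave the partitioning step unchanged.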

Using the VMS with various parameters, three sets of pseudo DNA sequences are generated, and their 2D maps are illustrated, analyzed and compared in the following subsections.


4.3 Sample Results


Using the three files of DNA sequences and one pseudo-random binary sequence with three parameter settings, six sets of 2D maps are listed in Figs. 4, 5, 6, 7, 8 and 9 under different conditions to illustrate their spatial distributions using the VMS in a controllable environment.

In Fig. 4, three groups of eighteen 2D maps are shown in the range of n = 3 ~ 50, k = 7, N = 200 ~ 600, T = 2700 for comparison: (a1-a6) six maps for the file right; (b1-b6) six maps for the file 201-500; (c1-c6) six maps for the file hc256.

In Fig. 5, four groups of sixteen 2D maps for the file right are listed in the range of n = 15, k = {2, 3, 4, 7}, N = 500, T = 2700: (a) group (a1-a4) four $Map_A$ maps; (b) group (b1-b4) four $Map_T$ maps; (c) group (c1-c4) four $Map_G$ maps; (d) group (d1-d4) four $Map_C$ maps.

In Fig. 6, four groups of sixteen 2D maps for the file hc256 are listed in the range of n = 12, k = {2, 3, 4, 7}, N = 500, T = 2700, r = 1, mode = 1: (a) group (a1-a4) four $Map_A$ maps; (b) group (b1-b4) four $Map_T$ maps; (c) group (c1-c4) four $Map_G$ maps; (d) group (d1-d4) four $Map_C$ maps.

In Fig. 7, two groups of eight 2D maps for the files left and right are compared in the range of n = 15, k = 7, N = 500, T = 2700: (a) group (a1-a4) four $Map_V$ maps for the file left; (b) group (b1-b4) four $Map_V$ maps for the file right.

In Fig. 8, three groups of twelve 2D maps for the file hc256 are compared in the range of n = 12, k = 7, N = 500, T = 2700, r = {1, 2, 3}, mode = 1: (a) group (a1-a4) four $Map_V$ maps, r = 1; (b) group (b1-b4) four $Map_V$ maps, r = 2; (c) group (c1-c4) four $Map_V$ maps, r = 3.

In Fig. 9, three groups of twelve 2D maps for the two files right and hc256 are compared in the range of k = 7, N = 500, T = 2700: (a) the file right, n = 15, mode = 0; (b) the file hc256, n = 12, mode = 1, r = 1; (c) the file hc256, n = 12, mode = 1, r = 3; (a1-c1) $Map_A$ maps; (a2-c2) $Map_T$ maps; (a3-c3) $Map_G$ maps; (a4-c4) $Map_C$ maps.


4.4 Result Analysis of 2D Maps 


The six groups of 2D maps contain different information, so their important issues are briefly discussed as follows.

The first group of results, shown in Fig. 4, presents three sets of eighteen 2D maps from the three data files right, 201-500 and hc256, using basic segment lengths from 3 to 50 to illustrate their variations. The six 2D maps in Fig. 4 (a1-a6) show significant traces in their visual distributions; the number of main visible clusters decreases as the segment length increases, e.g. (a3-a6). However, a shorter segment length does not provide refined visual distinctions and leaves larger fuzzy regions, e.g. (a1-a2). From a structural viewpoint, mid-range segment lengths provide better clustering results, e.g. (a3-a5), for further analysis. For the other six 2D maps of Fig. 4 (b1-b6), for the file 201-500, significantly different visual distributions can be observed compared with (a1-a6); the number of main visible clusters decreases less significantly as the segment length increases, e.g. (b4-b6). However, a shorter segment length does not provide refined visual distinctions and leaves wider fuzzy regions, e.g. (b1-b3). In general, mid-range segment lengths still provide better clustering effects, e.g. (b4-b6), for further analysis. For the six 2D maps of Fig. 4 (c1-c6), for the file hc256 with r = 1, visual distributions similar to (a1-a6) and significantly different from (b1-b6) can be observed; the number of main visible clusters decreases less significantly as the segment length increases, e.g. (c3-c6). However, a shorter segment length does provide refined visual distinctions with regions in fuzzy areas, e.g. (b1). In general, mid-range segment lengths still provide better clustering effects, e.g. (c2-c4), for further analysis. From their distributions, groups (a) and (c) share much stronger similarities with each other than with group (b).

Fig. 5 Four groups of sixteen 2D maps in the range of n = 15, k = {2, 3, 4, 7}, N = 500, T = 2700 for the file right; a group (a1-a4) four $Map_A$ maps; b group (b1-b4) four $Map_T$ maps; c group (c1-c4) four $Map_G$ maps; d group (d1-d4) four $Map_C$ maps

Fig. 6 Four groups of sixteen 2D maps in the range of n = 12, k = {2, 3, 4, 7}, N = 500, T = 2700 for the file hc256, r = 1, mode = 1; a group (a1-a4) four $Map_A$ maps; b group (b1-b4) four $Map_T$ maps; c group (c1-c4) four $Map_G$ maps; d group (d1-d4) four $Map_C$ maps

It is interesting to observe how the maps differ when the control parameter k changes. Four groups of sixteen 2D maps for the file right are shown in Fig. 5 in the range of n = 15, k = {2, 3, 4, 7}, N = 500, T = 2700; each of the four groups (a)-(d) provides four maps that share the same parameters except for k. Checking the visible clusters in the different maps, it is important to notice that nearly the same number of clusters is identified within a group, but different groups may contain significantly different numbers. A smaller k value (e.g. k = 2) produces a tighter distribution, and a larger k value (e.g. k = 7) gives better separation on the maps. Although the k = 7 maps provide better separation, it is easy to observe that their y-axis values are already in the 10% range.

Four groups of sixteen 2D maps for the file hc256 are shown in Fig. 6 in the range of n = 12, k = {2, 3, 4, 7}, N = 500, T = 2700, r = 1. This group of 2D maps can be compared with the 2D maps in Fig. 5. Under the same parameters, similar visible effects and feature clustering properties can be observed when various k values are selected.

Fig. 7 Two groups of eight 2D maps in the range of n = 15, k = 7, N ≈ 200 ~ 600, T = 2700; a group (a1-a4) four $Map_V$ maps for the file left; b group (b1-b4) four $Map_V$ maps for the file right

Fig. 8 Three groups of twelve 2D maps in the range of n = 12, k = 7, N = 500, T = 2700 for the file hc256, r = {1, 2, 3}, mode = 1; a group (a1-a4) four $Map_V$ maps, r = 1; b group (b1-b4) four $Map_V$ maps, r = 2; c group (c1-c4) four $Map_V$ maps, r = 3

Using a set of selected parameters, two groups of eight 2D maps are compared in Fig. 7 for the two files left and right to explore higher levels of symmetric properties for secondary or higher-level structures potentially contained in DNA sequences. The selected parameters are n = 15, k = 7, N = 500, T = 2700. Group (a) provides four $Map_V$ maps (a1-a4) for the file left; group (b) provides four $Map_V$ maps (b1-b4) for the file right.

For convenience of description, let ~ be a similarity operator. For groups (a) and (b), the four pairs of maps {(a1)~(b1), (a2)~(b2), (a3)~(b3), (a4)~(b4)}, i.e. (left-A~right-A, left-T~right-T, left-G~right-G, left-C~right-C), have strongly similar distributions between left and right. In addition, only two clustering classes can be clearly identified: {(a1)~(a2)~(b1)~(b2), (a3)~(a4)~(b3)~(b4)}, i.e. (left-A~right-A~left-T~right-T, left-G~right-G~left-C~right-C), respectively. This type of similar clustering distribution may strongly indicate that the eight maps reflect intrinsically higher-level structures of DNA sequences, with extra A-T and G-C pairs of symmetric relationships between the two files left and right.

Using a set of selected parameters, three groups of twelve 2D maps are listed in Fig. 8 for the file hc256, r = {1, 2, 3}, to explore properties of the higher-level structures potentially contained in pseudo DNA sequences. The selected parameters are n = 12, k = 7, N = 500, T = 2700. Group (a) provides four $Map_V$ maps (a1-a4) for r = 1; group (b) provides four $Map_V$ maps (b1-b4) for r = 2; group (c) provides four $Map_V$ maps (c1-c4) for r = 3. Using the similarity operator ~, for groups (a)-(c), the four triples of maps {(a1)~(b1)~(c1), (a2)~(b2)~(c2), (a3)~(b3)~(c3), (a4)~(b4)~(c4)}, i.e. (A(r=1)~A(r=2)~A(r=3), ..., C(r=1)~C(r=2)~C(r=3)), have strongly similar distributions among r = {1, 2, 3}. In addition, only two clustering classes can be clearly identified: {(a1)~(a2)~(b1)~(b2)~(c1)~(c2), (a3)~(a4)~(b3)~(b4)~(c3)~(c4)}, i.e. the three groups of maps fall into the classes (A~T, G~C), respectively.

For a convenient comparison, using a set of selected parameters, three groups of twelve 2D maps are compared in Fig. 9 for the files right and hc256, r = {1, 3}, to check the distribution properties of both DNA and generated pseudo DNA sequences. Group (a) provides four $Map_V$ maps (a1-a4) for the file right; groups (b) and (c) provide four $Map_V$ maps each, (b1-b4) for hc256, r = 1, and (c1-c4) for hc256, r = 3.

Using a weak similarity operator ~, for groups (a)-(c), the four triples of maps {(a1)~(b1)~(c1), (a2)~(b2)~(c2), (a3)~(b3)~(c3), (a4)~(b4)~(c4)} have strongly similar distributions between r = {1, 3} and a weakly similar distribution in the A and T cases. In addition, only two clustering classes can be clearly identified: {(a1)~(a2)~(b1)~(b2)~(c1)~(c2), (a3)~(a4)~(b3)~(b4)~(c3)~(c4)}, i.e. the three groups of maps strongly show the relationships (A~T, G~C) in the different cases, respectively.

In addition, this set of results provides direct visual comparisons that show strong similarity between DNA and pseudo DNA on VMS maps. Their similar clustering distributions may indicate that the stream cipher mechanism provides a comparable means of expressing real DNA sequences, with extra A-T and G-C pairs of symmetric relationships at their higher levels.


5 Conclusion


This chapter proposes an architecture to support the Variant Map System (VMS). Using a binary random sequence as input, a set of special pseudo DNA sequences can be generated. Under variant measures, probability measurement and normalized histograms, a pair of values can be determined by a series of controlled parameters. Collecting the relevant pairs over multiple DNA sequences, four 2D maps can be generated.

The main results of this chapter describe the VMS architecture in diagrams, main components, modules, expressions and important equations. Core models, diagrams and sample results are illustrated on two types of data sets, selected from real DNA sequences and generated as pseudo-random sequences by the stream cipher HC-256, for comparison under VMS testing. Once a proper set of parameters is selected, suitable visual distributions can be observed using the VMS. The results in Figs. 4, 5, 6, 7, 8 and 9 provide systematic evidence that the proposed VMS is useful for checking higher levels of symmetric/similar properties among complex DNA sequences in both natural and artificial environments.

This construction could provide useful insights into spatial information on complex DNA expressions, especially on large encoding RNA/DNA constructions, via 2D maps to explore higher levels of complex interactive environments in the near future.


Whole DNA Sequences of Cebus capucinus on Variant Maps


Yuyuan Mao, Jeffrey Zheng and Wenjia Liu 


Abstract DNA sequences as a big data stream have been researched for years. However, existing research methods have various limitations when applied to whole DNA sequences. In this chapter, a new scheme is proposed to map whole DNA sequences as 2D maps; the whole DNA sequence of the capuchin monkey (Cebus capucinus), a primate, is used as an example to demonstrate the mapping results.


Keywords Gene sequence · Cebus capucinus · Mapping method · Sequential model · Variant map


1 Introduction 


In modern biology, DNA from a wide range of species, from humans to simple cells, is being sequenced and stored in DNA data banks as big data streams. It is difficult to process the various DNA streams for classification and identification of species from whole sequences. The main task of present genomic research [1, 2] is to obtain more biological information by processing and analyzing DNA sequences from multiple angles and at multiple levels [4-7]. In recent years, the processing and utilization of biological gene data have been carried out in a variety of ways, such as gene feature extraction, gene sequence location [7-9], and so on.

The variant map is an emerging technology that handles four symbols as a meta-structure to process random sequences ranging from cryptographic sequences and DNA sequences [3, 10] to ECG signals. Multiple statistical probability distributions are generated from selected sequences to form 2D-3D visual maps. This scheme makes whole data sequences more compact and effectively visualized, and the mapping results may be useful for exploring nonlinear complex behaviors of whole genomes. A whole DNA sequence of a night monkey has been mapped [11] on variant maps.

In this chapter, a special scheme is proposed to show a series of mapping results from a selected gene sequence of a capuchin monkey.


2 Process Model 


A. Architecture 


The architecture of the process model is shown in Fig. 1a. The process model consists of five parts: input, processing, measurement, projection, and output. There are three modules: Processing, Measurement, and Projection.

Input: A DNA sequence

Output: A 2D map

Modules: Processing, Measurement, and Projection

Process: In the Processing module, the whole selected DNA sequence is divided sequentially into multiple segments of a fixed length m. In the Measurement module, the four symbols {A, C, G, T} are counted in each segment, which transforms the segments into a measuring sequence of four measures. In the Projection module, a special combination X: {AT} and Y: {AG} is selected to determine a projection position from the four measures, and the whole measuring sequence is projected onto a 2D map.


B. Processing Module 


From an input DNA sequence, multiple segments can be separated by a fixed 
length m to generate a sequence of segments. 

Input: a DNA sequence 

Output: a sequence of segments 


C. Measurement Module 


In this module, shown in Fig. 1b, the numbers of the four bases {A, G, C, T} are counted in each segment. As a result, each count is an integer between 0 and m, and the segment sequence is transformed into a measuring sequence of four measures.

Input: a sequence of segments

Output: a sequence of four measures
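The Processing and Measurement modules can be sketched together in Python. This is a minimal illustration with hypothetical function names and a toy input string; dropping an incomplete trailing segment is an assumption, since the text does not say how the tail of the sequence is handled.

```python
def split_segments(dna, m):
    """Cut the sequence into consecutive segments of length m (tail dropped)."""
    return [dna[i:i + m] for i in range(0, len(dna) - m + 1, m)]


def four_measures(segment):
    """Count each base in one segment; every count lies between 0 and m."""
    return {base: segment.count(base) for base in "ACGT"}


dna = "AACGTTAGACGATT"  # toy input, not real Cebus capucinus data
measures = [four_measures(s) for s in split_segments(dna, 4)]
print(measures)
```

For a segment that contains only the four bases, the four counts add up to m, so any three of them determine the fourth.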


D. Projection Module 


The projection module is shown in Fig. 1c as two units: Position and Projecting. For each set of four measures, two axis positions are determined by X(AT) and Y(AG), respectively. When all measures have been processed, a 2D histogram is established as a statistical distribution, forming a 2D map.

Input: a sequence of four measures 

Output: a 2D map 
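A hedged sketch of the Projection module follows. It assumes that the raw counts num(A) + num(T) and num(A) + num(G) are used directly as integer axis positions (dividing by m would give the corresponding proportions); the function name `project` and the toy measures are hypothetical.

```python
def project(measures, m):
    """Accumulate a (m + 1) x (m + 1) histogram indexed as grid[y][x]."""
    grid = [[0] * (m + 1) for _ in range(m + 1)]
    for c in measures:
        x = c["A"] + c["T"]  # X(AT) = num(A) + num(T)
        y = c["A"] + c["G"]  # Y(AG) = num(A) + num(G)
        grid[y][x] += 1
    return grid


toy = [
    {"A": 2, "C": 1, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 1, "T": 2},
    {"A": 2, "C": 1, "G": 1, "T": 0},
]
grid = project(toy, m=4)
print(grid[3][2])  # number of segments with AT = 2 and AG = 3
```

Each cell of the grid can then be rendered as a colored pixel, so that cells hit by a similar number of segments receive a similar color.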


Fig. 1 Architecture of the mapping scheme (a)-(c). a Architecture; b Measurement module; c Projection module


3 Details


A. Relevant Parameters 


m: segment length
V: one of the two base combinations {AT, AG}


num(AT) = num(A) + num(T);
num(AG) = num(A) + num(G);


$P_V$ = num(V)


$P_V$: the proportion of a base or combinatorial base
$(x_{P_{AT}}, y_{P_{AG}})$: a pair of XY mapping positions.

B. Parameter in Module 


Since the quality of the generated maps depends on the number of projection points, a refined map needs to include a large number of coordinate points. The mapping projection superposes a large number of coordinate points in a 2D histogram, which is displayed as a color map.


C. Measurement module


m: subsection length of a DNA sequence

num(AT) = num(A) + num(T)

V: AT or AG, V ∈ {AT, AG}

$P_V$: the proportion of AT or AG over the subsection length m.
$P_V$ = num(V)/m

$P_{AT}$: the proportion of AT

$P_{AG}$: the proportion of AG


$(x_{P_{AT}}^{i}, y_{P_{AG}}^{j})$: a pair of XY mapping coordinates; i, j index different subsections.


D. Parameter in Module 


The proportions of AT and AG in each subsection are calculated by elementary arithmetic. The two proportions form a coordinate $(x_{P_{AT}}, y_{P_{AG}})$, which maps to a point on the two-dimensional graph.

The mapping relation between x and y:


X: $P_{AT}$


Y: $P_{AG}$


A distinct graph needs to include a large number of coordinate points. Only a large number of DNA sequences can yield a large number of coordinate points and good projection results. The graphics projection module completes the superposition of this large number of coordinate points.


4 Results Display 


4.1 Maps on Various Segmented Length 


The maps for different parameters are shown in Fig. 2a-l for m = {20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 200}, Fig. 3a-f for m = {54, 56, 58, 60, 62, 64}, Fig. 4a-d for m = {59, 60, 61, 62} and Fig. 5 for m = 60, respectively.

In each map, pixels of similar color indicate a similar number of segments in the cluster.

4.2 Brief Analysis 


From Fig. 2, it is interesting to notice that when m < 50, the maps have more symmetric properties than for larger values of m. As the segmented length changes, significant patterns appear in the region m = 54-64 shown in Fig. 3, and refined lengths are shown in Fig. 4. From visual observation, the map for m = 60 shows the best effect.


5 Conclusion 


Using the proposed mapping scheme, it is feasible to transform a whole DNA sequence into a color map with significant visual features. In addition to the mapping method and selected functions, a set of samples on various segmented lengths illustrates colorful distributions as variant maps.

By checking symmetric information among different maps, it is possible to identify specific spatial features under different configurations.

Fig. 2 Variant maps of Cebus capucinus (white-faced sapajou) on various segmented lengths (a)-(l) m = {20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 200}. a m = 20; b m = 30; c m = 40; d m = 50; e m = 60; f m = 70; g m = 80; h m = 90; i m = 100; j m = 120; k m = 150; l m = 200

Fig. 3 Variant maps of Cebus capucinus on various segmented lengths (a)-(f) m = {54, 56, 58, 60, 62, 64}. a m = 54; b m = 56; c m = 58; d m = 60; e m = 62; f m = 64

Fig. 4 Variant maps of Cebus capucinus on various segmented lengths (a)-(d) m = {59, 60, 61, 62}. a m = 59; b m = 60; c m = 61; d m = 62

Fig. 5 Variant map of Cebus capucinus (white-faced sapajou) on segmented length m = 60 (horizontal axis: AT, from 0 to 60)


Since this is an initial step toward mapping a whole DNA sequence, further research and exploration are required.


Part IX
Applications: Multiple Valued Sequences


Experience without theory is blind, 
but theory without experience is mere intellectual play. 


—Immanuel Kant 


Make everything as simple as possible, but not simpler. 
—Albert Einstein 


Science cannot progress without reliable and accurate measurement 
of what it is you are trying to study. 
The key is measurement, simple as that. 

—Robert D. Hare 


To process multiple valued sequences, it is necessary to use more complex structures in the transformation. Various signals such as ECG, EEG, and BEC (Bat Echolocation Calls) were tested. Since 2016, various papers have been published on ECG processing, for example: Variant Maps on Normal and Abnormal ECG Data Sequences, Biol Med (Aligarh) 8:336, https://doi.org/10.4172/0974-8369.1000336; Mapping ECG Signals on Variant Maps, https://doi.org/10.1145/3110025.3110134; Visualization of P wave characteristics in ECG, https://doi.org/10.1109/CISP-BMEI.2017.8302247.

This part on multiple valued sequences is composed of two chapters (25 and 26).

Chapter “Successful Creation of Regular Patterns in Variant Maps from Bat Echolocation Calls” processes BEC signals on variant maps and classifies the variant maps into two distinct groups.

Chapter “Visual Analysis of ECG Sequences on Variant Maps” applies visual analysis to ECG sequences on variant maps; various normal and abnormal ECG sequences are selected for comparison, and significant characteristics of the various distributions are observed.


Successful Creation of Regular Patterns in Variant Maps from Bat Echolocation Calls


D. M. Heim, O. Heim, P. A. Zeng and Jeffrey Zheng 


Abstract We created variant maps based on bat echolocation call recordings and 
outline here the transformation process and describe the resulting visual features. 
The maps show regular patterns while characteristic features change when bat call 
recording properties change. By focusing on specific visual features, we found a 
set of projection parameters which allowed us to classify the variant maps into two 
distinct groups. These results are promising indicators that variant maps can be used
as a basis for new echolocation call classification algorithms.


Keywords Echolocation · Algorithms · Morphometry · Fourier · Analysis · Quaternions


1 Introduction


The identification of echolocation calls is essential to the research and conservation 
of bat species [1]. However, automatic classification algorithms have not yet been 
proven capable of providing 100% correct classifications or getting close enough 
to this ideal performance [2]. Since our approach of using variant maps [3] already
shows promising results, we are confident that it will continue adding valuable
contributions to the field of automatic bat call identification.

Automated bat echolocation call identification algorithms have been developed since
the late 1990s [4-7]. At that time, multivariate discriminant function analysis or
neural networks were used for the classification of the calls. Since then, other methods 
have been applied, e.g., algorithms of pattern recognition [8], support vector machines 
[9], hierarchical ensembles of neural networks [9, 10], geometric morphometry [11], 
machine learning [12], CART [13], and random forest classification [14]. For a 
critical analysis of the performance of the applied methods, we refer to [2] and the 
references therein. 

Using variant maps for the classification of bat echolocation calls differs completely
from these conventional techniques. The main difference is the preprocessing
step, where the recordings are transformed into variant maps. This step offers the 
possibility to analyze the bat call recordings from a completely different point of 
view. It provides additional degrees of freedom which allow a further optimization 
of the identification process, e.g., by supplementing the information obtained from 
a Fourier analysis of the bat calls. 

Our method to transform the bat call recordings is based on measures proposed by 
Zheng [15] in the 1990s to partition special phase spaces in binary image analysis. 
These methods were extended in the 2010s [3, 16] and successfully used to classify 
quantum interactions [17, 18], differently encrypted messages [19], and noncoding 
DNA [20, 21]. 

Similar to these works, we transform the bat call recordings using variant measures 
to obtain variant maps. Each recording contains several calls of one bat species. We 
used calls of four aerial-hawking bat species in this study. Recordings were made at 
three types of crop fields far away from woody vegetation. The created variant maps 
have a regular structure, but characteristic features vary strongly with each recording. 
These results show that variant maps can be used to extract usable information from 
bat echolocation recordings. 


2 Transformation 


The processed bat echolocation calls were recorded with a sampling rate of 500 kHz 
and saved as “raw” 16-bit audio files. In the following, we describe in four steps 
(A-D) how we transformed these files into variant maps. 

Step A: From analogue to digital audio
In a recording of data length N, the amplitude of the bat echolocation calls is stored 
in N samples. Each sample corresponds to a floating-point number of 16 bits. For 
simplicity, we transformed the floating-point numbers to integer numbers of 16 bits. 

Step B: From digital audio to quaternions
Next, we transform the integer sequence into a sequence of four metastates $\{\bot, +, -, \top\}$ which resemble the quaternions {Bottom, Plus, Minus, Top}. For this step, we select the i-th sample $A_i$ and its next neighbor $A_{i+1}$ and define the difference $\Delta A = A_{i+1} - A_i$ and the local average $L = (A_i + A_{i+1})/2$. Additionally, we require the maximum $A_{max}$ and minimum $A_{min}$ of the current sequence to define a middle value $V = (A_{min} + A_{max})/2$, and we define a tolerance T. Using these values, we transform the integer sequence $A_1 \cdots A_N$ into a sequence of quaternions $B_1 \cdots B_N$ using the rules


if $|\Delta A| < T$ and $L > V$: $B_i = \top$
if $|\Delta A| < T$ and $L < V$: $B_i = \bot$
if $|\Delta A| \ge T$ and $A_i > A_{i+1}$: $B_i = -$
if $|\Delta A| \ge T$ and $A_i < A_{i+1}$: $B_i = +$


As an example, the values T = 4 and V = 10 lead to the sequence


$A_i$     | 0 | 3 | 3 | 2 | 0 | 8 | 20 | 20 | 11
$A_{i+1}$ | 3 | 0 | 8 | 6 | 4 | 3 | 15 | 18 | 13
$B_i$     | $\bot$ | $\bot$ | + | + | + | - | - | $\top$ | $\top$

Step C: From quaternions to meta-measures
We subdivide the quaternion sequence into segments of length M and obtain, in this way, S = N/M segments. For each segment, we define four meta-measures $\{M_\bot, M_+, M_-, M_\top\}$. Each measure represents the number of associated quaternions in one segment. These meta-measures satisfy the relations $0 \le M_\bot, M_+, M_-, M_\top \le M$ and $M_\bot + M_+ + M_- + M_\top = M$. The quaternion sequence with N units is now represented by S segments, where each segment contains four meta-measures.
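Step C amounts to counting the four metastates in each segment. The following sketch uses the hypothetical state names "bottom", "plus", "minus" and "top" and ignores an incomplete trailing segment, which is an assumption for the case that N is not a multiple of M.

```python
from collections import Counter

STATES = ("bottom", "plus", "minus", "top")


def meta_measures(states, seg_len):
    """Return one (M_bottom, M_plus, M_minus, M_top) tuple per segment."""
    n_segments = len(states) // seg_len  # incomplete tail ignored (assumption)
    result = []
    for s in range(n_segments):
        counts = Counter(states[s * seg_len:(s + 1) * seg_len])
        result.append(tuple(counts[name] for name in STATES))
    return result


states = ["bottom", "plus", "plus", "top", "minus", "minus", "minus", "top"]
print(meta_measures(states, seg_len=4))
```

Each tuple sums to the segment length, which mirrors the relation between the four meta-measures and M.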

Step D: From meta-measures to variant maps
There are many possibilities to combine meta-measures for the creation of variant maps [3, 15-21]. To transform the bat echolocation calls into 2D color maps, we defined for each segment of meta-measures the axis values $X = M_+ + M_\bot$ and $Y = M_\bot + M_- + M_\top$. One Z value is obtained by counting the number of segments where one specific X-Y combination was found. Each Z value is represented by a color in an (M + 1) × (M + 1) matrix.
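The construction of the Z matrix in Step D can be sketched as below. The code assumes the axis definitions X = M_plus + M_bottom and Y = M_bottom + M_minus + M_top and stores the counts with rows indexed by Y and columns by X; the function name `variant_map` and the toy measures are hypothetical.

```python
def variant_map(measures, seg_len):
    """Count how many segments fall on each (X, Y) position.

    Each item of `measures` is (M_bottom, M_plus, M_minus, M_top).
    """
    z = [[0] * (seg_len + 1) for _ in range(seg_len + 1)]
    for m_bottom, m_plus, m_minus, m_top in measures:
        x = m_plus + m_bottom            # assumed definition of the X axis
        y = m_bottom + m_minus + m_top   # assumed definition of the Y axis
        z[y][x] += 1                     # rows are Y, columns are X
    return z


z = variant_map([(1, 2, 0, 1), (0, 0, 3, 1), (1, 2, 0, 1)], seg_len=4)
print(z[2][3])  # number of segments with X = 3 and Y = 2
```

Cells that remain zero correspond to positions that no segment reaches; a plotting library would typically leave them uncolored.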

As an example, we depicted in Fig. 1 the variant map of an echolocation call 
recording from the bat species Nyctalus noctula. It has a data length N = 967,139 
and we chose a segment length M = 237. At the position X = 80 and Y = 200 
marked by a white circle, the color indicates a value Z = 10. That is, we found 10 
segments where the conditions M} + M, = 80 and Mı + M- + M~ = 200 apply. 

[Figure 1 panel title: Nyctalus noctula R11_005, Segment M = 237 and Data Length N = 967139; color scale Z]


Fig. 1 The variant map of an echolocation call recording from the species Nyctalus noctula created 
by following the processing steps A-D described in Sect. 2. We highlighted the position X = 80 
and Y = 200 by a white circle to illustrate the processing step D. At this position, the conditions 
$M_+ + M_\bot = 80$ and $M_\bot + M_- + M_\top = 200$ apply. Further visual features are discussed in Sect. 3
in more detail 


White areas indicate regions without any projection point on this sequence. For a 
discussion of further visual features which appear in this figure we refer to Sect. 3. 

These types of maps offer the possibility to visualize long data sequences with 
>10° samples on compact matrices. We use this scheme to transform each bat call 
recording into a 2D color figure. It can be optimized for the identification of bat 
species, recording locations or times. 


3 Variant Maps 


Our main result is that all variant maps created from bat echolocation calls show 
regular patterns while characteristic visual features vary with each recording. In the 
following, we describe the data we processed in detail and discuss the visual features 
we observed. 
3.1 Data Description


We processed 44 files which were recorded in August 2012 in the Uckermark region 
(Brandenburg, Germany) [22]. Each recording contains only calls of one of the four 
European bat species Nyctalus noctula, Pipistrellus nathusii, Pipistrellus pipistrellus, 
or Pipistrellus pygmaeus. These files were recorded on arable fields cultivated with 
three different crop types: corn (C), rapeseed (R), or wheat (W). The record length 
varies between 30 s and 2 min.


3.2 Visual Features 


We transformed all 44 files of bat calls into variant maps by steps A to D described in 
Sect. 2. That is, we used the axis values $X = M_+ + M_\bot$ and $Y = M_\bot + M_- + M_\top$
and a segment length $M = 237$. By focusing on the visual features, we clustered the
resulting maps into two groups. A typical member of each group is shown in Fig. 2. 

One group consists only of maps showing patterns which have two significant 
maxima with values > 10°. We call members of this group double-maxima maps. 
The example shown in Fig. 2a has maxima at the positions X = 0, Y = 237 and X = 120,
Y = 200. Besides these two maxima, there are distinct positions on diagonal areas
with values of the orders 1-10. 

All other maps belong to the group of non-double-maxima maps. As an example, 
the map in Fig. 2b has its significant maximum at the position X = 0, Y = 237 while
other projection regions have values of the orders $1$-$10^3$. In addition, most values of
interest are located around a diagonal region and form a slanted band on the map.

All 44 resulting maps are shown in Figs. 3 and 4. They are separated into
double-maxima maps (Fig. 3) and non-double-maxima maps (Fig. 4). In principle, it is
possible to further subdivide the variant maps by identifying additional visual
features. However, since we have not yet found a direct connection between visual features
and bat call properties, a further subdivision goes beyond the scope of this manuscript
and will be the topic of a future publication.


3.3 Discussion 


On all generated maps, the positions in the lower-left triangular area are empty. This
is because our choice of axes obeys $X + Y \geq M$: together, the two sums count every
quaternion of a segment at least once. Empty positions in the upper-right
area appear because the bat call recordings consist of discrete short pulses with a
longer time period of silence in between.

Similarly, other visual characteristics in the colored areas can be directly related to 
properties of the bat call recordings. As an example, a signal of constant frequency 
can be transformed into a single position on a variant map by choosing suitable 

[Figure 2 panel titles: (a) Pipistrellus nathusii R04_021, Segment M = 237 and Data Length N = 1125351; (b) Nyctalus noctula R11_005, Segment M = 237 and Data Length N = 967139; axes X and Y, color scale Z]

Fig. 2 Variant maps of a Pipistrellus nathusii and b Nyctalus noctula, both recorded on a rapeseed
field. The figures were created by applying the transformation process described in Sect. 2. Panel a
shows a typical double-maxima map with two significant maxima, while b belongs to the group
of non-double-maxima maps

Fig. 3 These variant maps show double-maxima patterns. They have two significant maxima with
values >10°. The axis ranges are the same as in Fig. 2. Each map origins from a bat echolocation 
recording on a corn (C), rapeseed (R), or wheat (W) field 


parameters. This means that by optimizing the variant map transformation, it is 
possible to focus on features of the initial bat echolocation call for the creation of 
variant maps. 

This is the first time to our knowledge that quaternion structures have been used 
to transform bat calls. Our transformation process could be used to add optimizing 
parameters to current bat call identification schemes and in this way form the basis 
for a new identification algorithm. 

[Figure 4 panel labels: C08, R11, W06 (N. noctula); C23, R03, R03_1 (P. nathusii); C08, C18 (P. pipistrellus); C02, C09 (P. pygmaeus)]


Fig. 4 These variant maps show non-double-maxima patterns. That is, they explicitly do not have 
two distinct maxima with values >10° in contrast to the double-maxima maps shown in Fig. 3 


4 Summary and Outlook 


We transformed 44 bat echolocation files into variant maps. All created variant maps 
have a similar structure and can be classified by focusing on specific visual features. 
As an example, we found a set of projection parameters which allowed us to classify 
the recordings into double-maxima and non-double-maxima maps. 

Features like this can be traced back to the signal nature of the recordings. In
this way, variant maps offer the possibility to focus on individual features of bat
echolocation calls. Since there are many possible combinations to
create variant maps, we are confident that a suitable projection combination can
be found to fulfill our ultimate goal of identifying single bat species.

In order to meet this target, it is necessary to process a much larger number of
bat calls to create a sufficiently large database for the effective determination of
possible projections and associated maps. This would form an ideal basis for the
development of a new echolocation call identification algorithm.


Visual Analysis of ECG Sequences
on Variant Maps


Zhihui Hou and Jeffrey Zheng


Abstract This chapter presents a variant measurement method based on variant logic
that takes an ECG sequence as the signal source and outputs variant maps
of ECG sequences. It provides a supplementary study for ECG detection. Samples
of ECG signals were collected from the First People's Hospital of Yunnan Province.
On the variant maps, the main parameters are checked for various interval values and
the corresponding maps are illustrated.


Keywords Arrhythmia · Visualization · ECG sequences · Variant map


1 Introduction 


Cardiovascular disease is a worldwide concern [1]. Research on issues related to
cardiovascular diseases relies mainly on the detection of ECG signals.
The electrocardiogram represents cardiac function as a graphic signal [2]
and is an important means of diagnosing abnormal cardiac activity.

ECG signals are the product of a wide range of clinical ECG techniques. In recent
years, research methods for ECG signals have made significant progress, for example
by using machine learning [3], neural networks, clustering [4], partial fractal dimension
[5], wavelet transform [6], and other methods to detect and classify arrhythmia.
The most typical representative of the emerging ECG research methods is the ECG
scattergram [7-9].

Fig. 1 The overall structure of the variant map for ECG


The variant method is an emerging technique for dealing with spatial changes in
signal phase. The application of the variant method to binary image classification
and transformation was proposed in the 1990s [10, 11], and the
method has been refined since then [12, 13]. The variant method has been applied to
different data samples: quantum sequences [14, 15], random sequences [16],
non-coding DNA [17-19], bat echo signals [20], and electrocardiographic signals [21,
22], and effective research results have been obtained for these samples.

This chapter is a further study of the use of variant measurements in the detection
of ECG sequences. The sample ECG signals are provided by the First People's
Hospital of Yunnan Province. Two groups of signals are used: normal
ECG signals and abnormal ECG signals. The second part of this chapter
describes the variant map for ECG, the third part shows sample results with a brief
analysis, and the last part summarizes the chapter.


2 Variant Map for ECG 


The variant map for ECG is composed of six parts: Input, Processing, Segmenting,
Statistics, Mapping, and Output. Figure 1 shows the overall structure of the variant map for
ECG; each part is described below:


A. Input Part 
The ECG signals under test are provided by the hospital as the data source. Let the ECG
signal be a sequence $p$ with $N$ elements:


$p = \{p_0, \ldots, p_{N-1}\}$

B. Processing Part 


In the processing part, a multivalued ECG signal sequence is transformed into a
four-valued pseudo-DNA sequence.
Input: the ECG sequence


$p = \{p_0, \ldots, p_{N-1}\}$


Parameters: $W$, the sliding window value; $R$, the interval value.
Output: a four-valued pseudo-DNA sequence


$q = \{q_0, \ldots, q_{N-1}\}$


Processing: 


Let $\bar{p}_i$ be an average value, $r$ a range value, and $t_i$ a conversion value. The
values are calculated with the equations:


$p_{\max} = \max\{p_i\}, \quad 0 \leq i \leq N-1$


$p_{\min} = \min\{p_i\}, \quad 0 \leq i \leq N-1$

$r = (p_{\max} - p_{\min}) \cdot \frac{R}{2}$


$t_i = \frac{2 (p_i - \bar{p}_i)}{r \cdot R}$


Transforming rules, for $0 \leq i \leq N - 1$:


if $t_i > R > 0$: $q_i = A$; if $0 < t_i < R$: $q_i = G$;


if $0 > t_i > -R$: $q_i = C$; if $0 > -R > t_i$: $q_i = T$;
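

The transforming rules can be illustrated with a short Python sketch that starts from already computed conversion values $t_i$. The function name `to_pseudo_dna` is hypothetical, and the treatment of the boundary values ($t_i = R$ mapped to A, $t_i = 0$ to G, $t_i = -R$ to C) is an assumption, because the rules as printed do not assign these cases.

```python
def to_pseudo_dna(t, R):
    """Map conversion values t_i to the symbols A, G, C, T for an interval value R > 0."""
    if R <= 0:
        raise ValueError("R must be positive")
    symbols = []
    for ti in t:
        if ti >= R:        # t_i above the interval (boundary assigned to A by assumption)
            symbols.append("A")
        elif ti >= 0:      # 0 <= t_i < R
            symbols.append("G")
        elif ti >= -R:     # -R <= t_i < 0
            symbols.append("C")
        else:              # t_i below -R
            symbols.append("T")
    return "".join(symbols)


q = to_pseudo_dna([1.5, 0.3, -0.2, -2.0], R=0.95)
print(q)
```

Every conversion value receives exactly one symbol, so the output sequence has the same length $N$ as the input.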


C. Segmenting Part 


Input: $q = \{q_0, \ldots, q_{N-1}\}$.
Parameters: $m$ is the segment length.


Output: $Q = \{Q_0, \ldots, Q_j, \ldots, Q_{M-1}\}$, $0 \leq j < M$; $M$ is the number of segments and $N = m * M$.


Processing: the $j$-th element in $Q = \{Q_0, \ldots, Q_j, \ldots, Q_{M-1}\}$ is


$Q_j = \{q_{i + j m} \mid 0 \leq i < m\}, \quad 0 \leq j < M.$


D. Statistics Part 
Input: $Q = \{Q_0, \ldots, Q_j, \ldots, Q_{M-1}\}$, $0 \leq j < M$
Output: $S = \{s_j^A, s_j^C, s_j^G, s_j^T\}$, $0 \leq j < M$


$s_j^A$ is the number of A elements in $Q_j$

$s_j^C$ is the number of C elements in $Q_j$

$s_j^G$ is the number of G elements in $Q_j$

$s_j^T$ is the number of T elements in $Q_j$
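
The Segmenting and Statistics parts can be sketched together in Python. The function name `segment_counts` is hypothetical, and ignoring a trailing partial segment is an assumption for inputs whose length is not a multiple of $m$.

```python
def segment_counts(q, m):
    """Split the pseudo-DNA string q into segments of length m and count A, C, G, T in each."""
    if m <= 0:
        raise ValueError("m must be positive")
    M = len(q) // m  # number of segments; a trailing partial segment is ignored (assumption)
    counts = []
    for j in range(M):
        Qj = q[j * m:(j + 1) * m]  # the j-th segment
        counts.append({base: Qj.count(base) for base in "ACGT"})
    return counts


S = segment_counts("AAGCTTGA", m=4)
print(S)
```

Each dictionary in the result holds the four statistics of one segment, and its values sum to $m$.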
E. Mapping Part 
A pair of two elements of $S = \{s_j^A, s_j^C, s_j^G, s_j^T\}$, $0 \leq j < M$, is selected as the
mapping object. This chapter selects $(s_j^G, s_j^C)$: $s_j^G$ corresponds to the X-axis
and $s_j^C$ corresponds to the Y-axis. All $M$ pairs are mapped onto the 2D map as
output.


F. Output 


The results of the mapping are output in the form of 2D variant maps. 
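
A minimal sketch of the Mapping and Output parts is given below. It is an illustration only: the function name `ecg_variant_map` and its parameters are hypothetical, the choice of the G and C counts as default axes is an assumption about the selected pair, and representing the 2D map as a count per position (rather than a plotted image) is a simplification.

```python
from collections import Counter


def ecg_variant_map(counts, x_key="G", y_key="C"):
    """Count how many segments fall on each (x, y) position of the 2D map.

    counts: list of dicts with the keys 'A', 'C', 'G', 'T', one dict per segment.
    """
    return Counter((c[x_key], c[y_key]) for c in counts)


segment_stats = [
    {"A": 2, "C": 1, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 1, "T": 2},
    {"A": 0, "C": 1, "G": 1, "T": 2},
]
points = ecg_variant_map(segment_stats)
print(points)
```

Passing other values for `x_key` and `y_key` selects a different pair of statistics as the projection.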


3 Sample Results and Brief Analysis 


The visualization results obtained with the variant map for ECG show that the
morphological features of ECG signals change in a regular way. Sample results are
illustrated below, followed by a brief analysis.


A. Data Source Description 


The ECG signals in this chapter are provided by the First People's Hospital of Yunnan
Province. The data contain a total of 202,626 cases: 104,742
normal cases and 97,884 abnormal cases. For this experiment, 97,884
normal cases and 97,884 abnormal cases were selected.

Since ECG signals have multiple attributes, this chapter selects the P wave attributes
of the samples for processing. Figure 2 shows part of the abnormal ECG
data source.


B. Visualization Features 


Using the variant map for ECG, multiple maps can be generated. 

The interesting finding is that changes of the parameters affect the spatial
characteristics and phase changes of the maps.

Figure 3 shows two 2D maps, one for normal and one for abnormal signals, with the
parameters W = 24, R = 0.95, m = 50. The X and Y values are the pair of statistics
selected in the Mapping Part, $0 \leq j < M$. The ECG variant maps show regular
characteristics: in Fig. 3a, the normal map for the P wave is an oval; in Fig. 3b, the
abnormal map for the P wave is a stick.

Figure 4 lists normal maps for the P wave for the parameters R =
{0.6, 0.72, 0.84, 0.96, 1.08, 1.2}. When the parameter R increases, the features
of the maps show a nonlinear displacement toward the top right corner of the
image.



[Figure 2: table excerpt of abnormal ECG records; recoverable column headers include STUDYINSTANCEUID, CONCLUSION, and QTC]


Fig. 2 The sample of part of abnormal ECG data source 
Fig. 3 Examples of normal and abnormal ECG variant maps


Figure 5 lists abnormal maps for the P wave for the parameters R =
{0.6, 0.72, 0.84, 0.96, 1.08, 1.2}. When the parameter R increases, the features
of the maps show a nonlinear displacement toward the top right corner of the
image.

Comparing Figs. 4 and 5 reveals differences between the features of normal and
abnormal maps.


4 Summary and Prospect 


Electrocardiogram (ECG) detection is the key to the clinical diagnosis of heart
disease and has important clinical value. At present, the automatic analysis function of
dynamic ECG detection is not satisfactory. In addition, the waveform features
of lesions can be too small to be marked, and lesion characteristics may even
be neglected. Therefore, mining the effective information contained in
massive ECG signals can avoid blind areas of ECG analysis to some extent, which
has practical application value.

[Figure 4 panel titles: (a) Normal ECG Data; R = 0.60, (b) Normal ECG Data; R = 0.72]
Fig. 4 A list of normal maps for P wave on parameters R = {0.6, 0.72, 0.84, 0.96, 1.08, 1.2};
a-f maps on R = {0.6, 0.72, 0.84, 0.96, 1.08, 1.2}


This chapter presents a new scheme of statistical distribution, the variant map for
ECG. This method can process massive ECG data sequences as 2D maps with visual
characteristics. The sample results show that a classification of arrhythmia characteristics
can distinguish normal from abnormal ECG signals, which differ significantly on the maps.
Further explorations and more experiments are required.

[Figure panel titles: (a) Normal ECG Data; R = 0.60, (e) Normal ECG Data; R = 1.08]
Fig. 5 A list of abnormal maps for P wave on parameters R = {0.6, 0.72, 0.84, 0.96, 1.08, 1.2};
a-f maps on R = {0.6, 0.72, 0.84, 0.96, 1.08, 1.2}